
1 Introduction and Learning Objectives

We as readers experience reading as if we smoothly slid our eyes from one text line to the next. However, our intuition does not hold true. Stable periods where the eyes stay relatively still are frequently interspersed with abrupt, fast movements of the eyes. Stable periods are called fixations and movements are called saccades, which are the fastest motor movements humans can make. A typical eye movement pattern of an adult reading a sentence in Finnish is depicted in Fig. 7.1.

Fig. 7.1

A typical eye movement pattern of an adult reading a sentence in Finnish. Dark circles depict fixations, the number attached to each circle gives the fixation duration, and the white arrows depict saccades and their direction

During reading we also experience large parts of the written text to be readily available for us to take in. This intuition is also incorrect. The foveal area of the eye, where visual acuity is best, is very limited in scope, only about 2° of visual angle. Hence, we can typically identify only one or two words at a time. In order to bring the foveal area to an optimal location for word identification, we need to make frequent saccades from one word to the next. Even though we make frequent fixations in the text, the individual fixations are short, typically lasting for 200–250 ms among competent readers. Thus, adult readers make 4–5 fixations per second. A typical saccade extends about 7–10 letters in reading an alphabetic language. However, saccade length depends on the orthographic properties of the language; for example, saccades are much shorter when reading a logographic script like Chinese. Box 1 provides a summary of the basic aspects of eye movements during reading (Rayner, 1998, 2009).

Box 1: Basic Features of Eye Movements During Reading

Fixation = the period of time when the eyes remain relatively stationary in one place in text. Intake of visual input takes place during fixations. Average fixation duration in reading is about 200–250 ms.

Saccade = a fast, ballistic eye movement that takes the eyes from one word to another. Due to the limitations of the foveal area (which extends only about 2° of visual angle around the point of fixation), readers need to make a series of saccades to visually sample the written text. Saccades typically last for about 20–40 ms in reading; their average length in reading alphabetic script is about 7–10 letters (somewhat depending on the script).

Return sweep = a long-range saccade that takes the eyes from the end of one text line to the beginning of the next line. It is typically followed by a short corrective saccade to the left (when reading from left to right).

Saccadic suppression = during saccades, no visual input is acquired, so in that sense the reader is functionally blind during the saccades. It is utilized in the boundary paradigm (see Box 2). If a change is made in the text during a saccade, the reader does not notice the actual change taking place.

Regression = a saccade launched in the direction opposite to the normal reading direction. Short, corrective regressions often appear after a return sweep or after launching a saccade that lands too far to the right of the word’s center (i.e., the optimal viewing position). Longer regressions (for more details, see the Section on Eye movements in text reading) are made in the service of comprehension monitoring: when the reader (a) has misunderstood something, (b) has forgotten something, (c) needs to resolve an inconsistency between two text elements, (d) wants to refresh his/her mental representation of the text by taking a second sample of a text region, or (e) is not ready to move on to a new text region (e.g., from one sentence to the next).
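The definitions in Box 1 can be turned into a simple labeling rule. The following is a minimal sketch, assuming left-to-right reading and a hypothetical, simplified data format in which each fixation is recorded as a (line index, character position) pair; the function name and the labels are illustrative and do not come from any particular eye-tracking software.

```python
from typing import List, Tuple

# Hypothetical, simplified format: (line_index, character_position).
Fixation = Tuple[int, int]


def classify_saccades(fixations: List[Fixation]) -> List[str]:
    """Label the saccade between each pair of successive fixations."""
    labels = []
    for (line_a, pos_a), (line_b, pos_b) in zip(fixations, fixations[1:]):
        if line_b > line_a:
            # The eyes move down to a later line of text.
            labels.append("return sweep")
        elif line_b < line_a or pos_b < pos_a:
            # Movement against the normal reading direction.
            labels.append("regression")
        elif pos_b > pos_a:
            labels.append("forward saccade")
        else:
            # Same line and same position: no change in location.
            labels.append("none")
    return labels


# Illustrative (made-up) fixation sequence over two lines of text.
example = [(0, 3), (0, 11), (0, 19), (0, 14), (0, 27), (1, 5), (1, 2)]
print(classify_saccades(example))
```

In this made-up sequence, the last saccade is a short leftward movement right after a return sweep, which corresponds to the corrective saccade described in Box 1. Real data would require thresholds on saccade amplitude, which are omitted here.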

As the readers fixate on nearly every word (on longer words even more than once), an eye movement record provides a researcher with a real-time protocol of how the reader proceeds with his/her eyes through the text. Thus, readers’ eye fixation patterns have successfully been used to investigate various mental processes ongoing during reading. This is made possible by the close link between where the eyes are gazing and what the mind is engaged with. This phenomenon is referred to as the eye-mind hypothesis (Just & Carpenter, 1980). Difficulty in processing may be reflected in longer and/or more fixations on the text region requiring extra effort to be comprehended. The reader may also go back in the text when realizing that (s)he has misunderstood something in the previous text or when (s)he would like to double-check his/her understanding of previous text. All this is faithfully reflected in readers’ eye movements that can then be used to tap into the mental processes ongoing during reading. To date, readers’ eye movement recordings have been very productively put to use for studying word recognition, the size and nature of the effective visual field, and syntactic parsing of sentence and clause structure among competent adult readers. Fewer eye movement studies have examined comprehension processes ongoing when reading longer texts.

Recently, an increased number of studies have been devoted to using eye movements to investigate how the reading skill evolves during the initial stages of skill development (for a review, see Blythe, 2014; Blythe & Joseph, 2011). The interest here has been to study (a) how reading development is reflected in the eye movement record and (b) what added value may be gained by using eye-tracking to study early reading development. Hence, in this chapter we also review recent studies focusing on reading development among normally developing children (for atypical development, see Klein et al., this volume).

In what follows, we review key findings of eye movement studies on word recognition, sentence parsing and text comprehension (for more comprehensive reviews, see Rayner, 1998, 2009) among competent readers and to some extent also among developing readers. Our focus is on reading alphabetic scripts. Readers interested in Chinese reading may consult, for example, the special issue edited by Liversedge, Hyönä, and Rayner (2013) and the review of Zang, Liversedge, Bai, and Yang (2011).

After reading this chapter, the learner knows:

  • What the major determinants of fixation time are on individual words.

  • What type of information is extracted from the parafovea during reading.

  • What kind of eye movement paradigms have been developed to study reading, and what is the logic behind them.

  • How visual attention and eye movements are related to each other.

  • What serial and parallel models of eye movement guidance are and how they differ from each other.

  • How eye movements of younger, less skilled readers differ from those of more mature adult readers.

  • How sentence comprehension processes are reflected in readers’ eye movements.

  • How good comprehension (in comparison to poor comprehension) is reflected in eye movements during text reading.

  • How the reading task influences readers’ eye movements.

2 Historical Annotations

The history of eye movement research can be roughly divided into three eras: (1) the early days (from the end of the 1870s to the 1930s) when the basic facts about the nature of eye movements during reading were discovered, (2) the time of the behaviorist movement (1930s–1956) that was characterized by fewer, yet very interesting, studies of eye movements and reading, and (3) the cognitive era (from 1956 onwards), which is characterized by remarkable methodological and technological advancements. A brief timeline of the most important events is presented in Fig. 7.2 (see also Rayner, 1978, 1998; Wade & Tatler, 2009).

Fig. 7.2

A timeline of the most important events in the history of eye movement research

The history of observing eye movements during reading dates back to the end of the 1870s, when the “jerky” nature of readers’ eye behavior was documented independently by two researchers, a Frenchman, M. Lamarre, and a German researcher, E. Hering (Wade & Tatler, 2009). The first eye tracking devices were fairly crude mechanical systems that were based on detecting eye movements via a mirror and concurrently detecting sounds produced by saccadic movements via a rubber tube placed on the eyelid (see Wade & Tatler, 2009). Despite their simplicity, the systems were accurate enough to provide basic facts about eye movements during reading. These initial findings are reported in a book by Huey (1908), which can be regarded as one of the classics in the field.

After the initial stages, there was a slight decrease in the use of eye movement recordings to study reading, which has been connected with the rise of the behaviorist movement. However, the research topics that were covered at that time are still timely. For example, Tinker (1958) reviews studies that examined the influence of text difficulty on eye fixations, compared eye movement patterns during reading in 14 different languages, observed differences in fixation times induced by various reading tasks, and investigated developmental trends in eye movements during reading.

As the cognitive revolution started in the mid-1950s, several methodological and technological advancements helped to spark new research. The introduction of gaze-contingent display paradigms (see Box 2) provided new tools for studying the control of eye movements during reading. Since then, eye tracking devices have become easier to use without compromising data quality, and trackers have now even been combined with brain imaging techniques, such as EEG (e.g., Baccino & Manunta, 2005) and fMRI (e.g., Brown, Vilis, & Everling, 2008).

3 Theories of Eye Guidance in Reading

Over the years, eye movement research in reading has gathered a wealth of data that has allowed scientists to propose detailed, mathematically formulated models of eye guidance during reading. The proposed theories model local eye movement patterns pertaining to word processing. They are built to model (a) foveal word processing (i.e., eye movement patterns on words when they are fixated) as well as (b) parafoveal word processing (i.e., the extent to which processing extends to words adjacent to the fixated word). Next, key aspects of the two most prominent models are described. This is followed by a summary of what is currently known about foveal and parafoveal word processing, as revealed by readers’ eye movements.

The most influential model of eye guidance during reading is the E-Z Reader model put forth by Reichle et al. (1998, 2003; Pollatsek et al., 2006). It models fixation durations on words and saccade programming between words. It is a serial model in that attention is allocated to only one word at a time. Another key tenet of the model is that shifting of attention and saccade programming are decoupled. When a word is fixated, attention is allocated to that word in order to access its identity. Once this lexical access is imminent, and processing has reached a state where the word looks familiar and is about to be identified, a saccade is programmed to the next word. However, attention stays at the fixated word until the lexical access has completed. The familiarity check is reached sooner when the word is highly familiar to the reader or is predictable from the prior text context. This way the model simulates the well-established word frequency and predictability effects discussed in more detail below.

Due to saccadic latency, some time elapses before a saccade is executed after it is programmed. This is called the labile stage: a saccade is initiated but not yet executed, and it is still possible to cancel and reprogram it. If lexical access is completed for the currently fixated word, which we will call Word N, and a saccade to the next word (Word N + 1) has not yet been executed, attention shifts to Word N + 1 before the eyes do. This way processing of Word N + 1 is initiated prior to its foveal inspection. This is the time period when Word N + 1 is parafoveally processed. During this time, the eyes and attention are decoupled. In the relatively rare cases when there is sufficient time for the familiarity check to complete for Word N + 1 as well, the following saccade to Word N + 1 may be canceled and reprogrammed to Word N + 2, in which case Word N + 1 is skipped. This will take place if the familiarity check for Word N + 1 completes during the labile stage when the saccade may still be reprogrammed. If it completes after that, during the so-called non-labile stage, the saccade can no longer be reprogrammed, but a saccade is executed to Word N + 1, followed by a short fixation on it prior to saccading to Word N + 2. The E-Z Reader model also takes into account visual acuity in modeling fixation durations on words. The further the individual letters of the word are from the fixation center, the longer the familiarity check takes.
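The skipping logic described above can be illustrated with a small decision rule. This is a minimal sketch, not the E-Z Reader implementation: the function name, its inputs, and the numeric values in the example are hypothetical and chosen only to show the distinction between the labile and non-labile stages.

```python
def next_saccade_target(familiarity_done_ms: float, labile_ms: float) -> str:
    """Return the target of the pending saccade from Word N.

    familiarity_done_ms: time, measured from the start of programming the
        saccade to Word N+1, at which the familiarity check for Word N+1
        completes (hypothetical input).
    labile_ms: duration of the labile stage, during which the saccade can
        still be canceled and reprogrammed (hypothetical parameter).
    """
    if familiarity_done_ms < labile_ms:
        # Saccade is still cancelable: reprogram it to Word N+2,
        # so Word N+1 is skipped.
        return "N+2"
    # Non-labile stage: the saccade to Word N+1 is executed as planned.
    return "N+1"


# Illustrative values only; they are not parameters taken from the model.
print(next_saccade_target(familiarity_done_ms=80, labile_ms=100))   # N+2
print(next_saccade_target(familiarity_done_ms=130, labile_ms=100))  # N+1
```

The sketch leaves out everything that makes the timing variable in the model, such as word frequency, predictability, and visual acuity; it only captures the point that the same familiarity check leads to skipping or not depending on when it completes.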

As described above, the E-Z Reader model assumes words to be attended serially one at a time. Attention does not shift to the next word before the currently fixated word is identified. Its prime competitor, the SWIFT (Autonomous Saccade-Generation With Inhibition by Foveal Targets) model (Engbert et al., 2005) challenges this assumption by postulating that multiple words (up to four) may be simultaneously attended. This is achieved on the basis of a dynamically determined processing gradient. Not only the fixated word is activated, but also activation related to Word N − 1 may linger on, if it is not completely identified. Moreover, Word N + 1, and possibly Word N + 2 may receive activation, depending on the scope of the attentional gradient at any one time. Due to the dynamic nature of the attentional gradient, processing may also be limited to a single word, comparable to serial processing assumed by E-Z Reader.

Another key difference between the two models is that SWIFT assumes saccade initiation to be a stochastic process, rather than determined by word processing difficulty, as assumed by E-Z Reader. Yet, word processing can impact on saccade initiation by inhibiting a saccade when lexical access is incomplete. This way SWIFT is able to simulate effects of word frequency and predictability known to affect fixation time on words. However, it is done differently from E-Z Reader. While E-Z Reader assumes lexical processing to be the driving force of eye guidance during reading, in SWIFT lexical processing influences saccade programming by delaying stochastic saccade programming in response to processing difficulty.

A third key difference between the models concerns the mechanism responsible for determining saccade targets. As described above, according to E-Z Reader the target for the next saccade from Word N is Word N + 1, unless the familiarity check for Word N + 1 has also been completed sufficiently early, in which case Word N + 2 is targeted. According to SWIFT, on the other hand, saccade targets are selected on the basis of the dynamic lexical activation pattern within the attentional gradient. Target selection occurs as a competition between words that possess a variable degree of lexical activation. “Activation is built up in a preprocessing stage and decreases during a later lexical completion process. The relative amount of activation will determine the probability that a word is selected as a saccade target” (Engbert et al., 2005, p. 782). In other words, the word with the highest lexical activation at a given time is the most likely to be selected as the target for the next saccade. For example, if the fixated word is difficult to process, lexical access is not completed and its activation remains high. As a consequence, it may win the competition for the next saccade, resulting in a refixation of that word. On the other hand, when a word is easy to process, lexical access is achieved rapidly, resulting in a decay in activation. In such a case, a word in the parafovea will have more activation than the fixated word; thus, a saccade is targeted to it. In such a way, using the same underlying mechanism, SWIFT is in a position to model all types of saccades: saccades to the following word, refixations of the same word, word skipping, and regressions back to a previous word.
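The idea that relative activation determines selection probability can be sketched as follows. This is a deliberately simplified illustration, assuming that the probability of selecting a word is its share of the total activation; the actual equations of SWIFT are more elaborate, and the function names and activation values below are hypothetical.

```python
import random
from typing import Dict


def target_probabilities(activations: Dict[str, float]) -> Dict[str, float]:
    """Convert lexical activations into selection probabilities.

    Simplifying assumption: probability is the word's share of the
    summed activation of all words within the attentional gradient.
    """
    total = sum(activations.values())
    if total <= 0:
        raise ValueError("at least one word must have positive activation")
    return {word: a / total for word, a in activations.items()}


def select_target(activations: Dict[str, float], rng: random.Random) -> str:
    """Draw one saccade target at random, weighted by activation."""
    probs = target_probabilities(activations)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]


# Made-up activations: the fixated Word N is difficult, so its activation is high.
difficult = {"N-1": 1.0, "N": 6.0, "N+1": 2.0, "N+2": 1.0}
print(target_probabilities(difficult))
print(select_target(difficult, random.Random(0)))
```

With these made-up values the fixated word is the most probable target, which corresponds to a refixation; lowering its activation would shift the probability toward Word N + 1 or Word N + 2.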

It is important to note that both the E-Z Reader and SWIFT models can successfully simulate key findings related to eye movements in reading (discussed below). They do so by postulating different mechanisms governing fixation durations and saccade programming. They are both specified in a series of mathematical equations that make it possible to compare their predictions with the observed data. They both also make new predictions for which no data are available. Thus, they have the potential to move research forward in a theory-driven way. Finally, it should be noted that E-Z Reader and SWIFT are not the only models of eye guidance in reading. An interested reader may consult Reichle (in press), who provides a comprehensive review of all major theories of reading, including also theories not making recourse to readers’ eye movements.

4 Eye Movement Paradigms Used to Study Reading

Eye movement recordings can be utilized in various ways to study reading-related cognitive processes. In a typical reading study, participants are asked to read single sentences silently for comprehension. Fixation durations on words and saccade lengths are then computed to examine how different word characteristics, which can be experimentally varied, influence the eye movement patterns (Rayner, 1998). Comprehension is periodically tested by either asking participants to paraphrase the sentence, or by asking readers to respond to statements regarding the sentence contents. Reading of texts consisting of more than one sentence can be studied by presenting multiple lines of text on one page, and the text may also extend to more than one page. Depending on the accuracy of the eye tracker, the line spacing is adjusted to be wide enough (i.e., 2.5 line spacing) in order to reliably differentiate which line of text the reader is currently fixating on. In addition to computing fixation durations on words, it may be necessary to compute fixation durations on phrases or sentences (Hyönä, Lorch, & Rinck, 2003). After reading, a measure of comprehension (e.g. free recall) is collected to check that the reader was engaged with the reading task; it is also fruitful to examine the relationship between comprehension and the way that the text was inspected.

In addition to simply recording eye movements during reading of sentences or texts, the different applications of the gaze-contingent display change paradigm (see Box 2) can be used to examine in more detail the cognitive processes underlying reading. In these paradigms, eye gaze is constantly tracked with an eye tracker, and changes to the text display are made depending on the location and direction of the movement of the eye gaze. Studies utilizing these paradigms reveal interesting facts about the interplay between vision and cognition during reading: how much information can be extracted on one fixation, how much time is needed for different types of information to be extracted from a word, what kind of information can be extracted from the parafovea, and whether regressive eye movements are needed for comprehension.

Box 2: Gaze-Contingent Eye Movement Paradigms

Examples of gaze-contingent eye movement paradigms are presented in Table 7.1. In the examples, a circle indicates fixation location.

Table 7.1 Examples of the gaze-contingent display change paradigms. The gray circle marks the fixation location

In the disappearing text paradigm (Liversedge et al., 2004), the fixated word either disappears or is masked after the fixation onset, e.g. after a 40 ms delay. By varying the time of disappearance, it is used to estimate the minimum exposure time needed for reading to continue with normal speed.

In the fast priming paradigm (Sereno & Rayner, 1992), the target word is initially replaced with a prime stimulus. After a short delay (e.g. 30 ms) after fixation onset on the word, the prime is replaced with the target. The paradigm is used to examine what kinds of primes facilitate reading.

In the moving window paradigm (McConkie & Rayner, 1975), only a specified area (e.g., 11 characters to the right and left) around the point of fixation is displayed normally while other parts are masked. The window moves in synchrony with the eyes. By varying the size of the window and the type of mask (e.g., X’s, visually similar or dissimilar characters) and comparing the reading times in the window and normal reading conditions, it is possible to define the size of the area from which a reader can efficiently extract and utilize information.

The boundary paradigm (Rayner, 1975) makes use of saccadic suppression. Saccadic suppression means that during a saccade the intake of visual information is suspended and the reader is practically blind. If a change in the visual environment is made during a saccade or very soon after the eyes have landed (< 6 ms after the end of a saccade, McConkie & Loschky, 2002), the reader does not become consciously aware of it. The target word (“sentence” in the example of Table 7.1) is initially masked with a character string (“somkasoc”), and when the reader’s eyes cross an invisible boundary in the text, the mask is replaced with the actual target word. If the reader has extracted information from the target word preview prior to its change to the correct form, one should observe increased fixation time on the target word, even though the reader is not consciously aware of this. The size of the slow-down in eye fixation time, i.e., the difference between the normal condition, in which no change was made, and a change condition, is called the preview effect.

In the trailing mask paradigm (Schotter, Tran, & Rayner, 2014), previously fixated words are replaced with a mask as soon as the reader moves away from the word, so that if the reader returns to already read words, no useful information is available. Words to the right of the fixation point are presented normally.
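The masking logic of the moving window paradigm described in Box 2 can be sketched in a few lines. This is a minimal illustration, assuming that text positions are counted in characters, that the mask is a string of x's, and that spaces between words are preserved outside the window (one of several possible mask types); the function name and parameters are hypothetical.

```python
def moving_window(text: str, fixation: int, left: int, right: int,
                  mask: str = "x") -> str:
    """Return the text as displayed for a fixation at character index `fixation`.

    Characters from `left` positions to the left up to `right` positions
    to the right of the fixated character are shown normally. Outside the
    window, letters are replaced by `mask`; spaces are kept (an assumption
    of this sketch).
    """
    shown = []
    for i, ch in enumerate(text):
        inside = fixation - left <= i <= fixation + right
        shown.append(ch if inside or ch == " " else mask)
    return "".join(shown)


sentence = "The quick brown fox jumps over the lazy dog"
# The display is recomputed for every new fixation location.
for fixation in (5, 12, 22):
    print(moving_window(sentence, fixation, left=4, right=8))
```

In an actual experiment the display would be updated from the eye tracker's gaze samples in real time; the sketch only shows what the reader would see for a given fixation location and window size.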

Two paradigms have proven useful in examining foveal word processing: the disappearing text paradigm and the fast priming paradigm. In the disappearing text paradigm (see Box 2), a target word is initially presented normally. However, after a fixation lands on the word, the word disappears from the screen. By varying the length of the delay of the disappearance, it is possible to infer how long a visual exposure is required for word identification to proceed normally. In the fast priming paradigm, a prime stimulus is initially presented in place of the target word. After a short delay (e.g., 30 ms) from fixation onset on the word, the prime is replaced with the target word. In the example presented in Table 7.1, the prime comprises all the consonants of the word to examine whether consonants play a privileged role in early stages of word processing. If so, presenting the vowels first followed shortly by the consonants would delay word recognition compared to the situation depicted in Table 7.1.
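The timing logic of these two paradigms can be expressed as a function of the time elapsed since fixation onset on the target word. The following is a minimal sketch under stated assumptions: the function names, the default delays, and the example strings are hypothetical, and the disappeared word is represented here by blank spaces.

```python
def disappearing_text_display(ms_since_onset: float, word: str,
                              visible_ms: float = 60) -> str:
    """What is shown at the target location in the disappearing text paradigm.

    The word is visible for `visible_ms` after fixation onset and is then
    replaced by blanks of the same length (an assumption of this sketch).
    """
    return word if ms_since_onset < visible_ms else " " * len(word)


def fast_priming_display(ms_since_onset: float, prime: str, target: str,
                         prime_ms: float = 30) -> str:
    """What is shown at the target location in the fast priming paradigm.

    The prime is shown for `prime_ms` after fixation onset and is then
    replaced with the target word.
    """
    return prime if ms_since_onset < prime_ms else target


# Hypothetical prime containing only the consonants of the target word.
for t in (0, 20, 40, 80):
    print(t,
          repr(disappearing_text_display(t, "sentence")),
          repr(fast_priming_display(t, "s_nt_nc_", "sentence")))
```

Varying `visible_ms` or `prime_ms` across conditions corresponds to the manipulation of exposure time and prime duration described above.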

Two main paradigms used to study parafoveal information processing during reading are the text window paradigm (called the moving window paradigm in Box 2) and the boundary paradigm, both described in Box 2. In the text window paradigm, the reader sees only a certain amount of useful information around the point of fixation. The window moves in synchrony with the eyes, and by varying the size of the window, it is possible to estimate the size of the effective field of vision, or as it is often called, the size of a reader’s perceptual span. In the boundary paradigm (see Box 2), the target word is initially masked with a letter string. However, when the eye gaze crosses an invisible boundary placed at the end of the preceding word, the target word appears normally. By manipulating the type of preview it is possible to infer what type of information is extracted from the target word before it is actually fixated. For example, visual similarity is manipulated in the boundary paradigm by replacing letters in the parafovea with visually similar (e.g., k with h) or dissimilar letters (e.g., g with t). The main index is the parafoveal preview benefit, which assesses the amount of facilitation in processing gained by different kinds of parafoveal previews. The preview benefit is simply computed as the difference in fixation time between a full preview condition, in which the target word was presented normally, and the manipulated preview condition. Another measure to assess parafoveal processing is the so-called parafoveal-on-foveal effect (Kennedy, 2000). It measures the extent to which parafoveally available information affects fixation time on the previous word (Drieghe, 2011).
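As a worked illustration of the preview benefit computation, the sketch below takes fixation times on the target word from the two conditions and returns the difference of their means. The function name and the numbers are invented for illustration; an actual analysis would typically be carried out per participant and item with appropriate statistical models.

```python
from statistics import mean
from typing import Sequence


def preview_benefit(full_preview_ms: Sequence[float],
                    changed_preview_ms: Sequence[float]) -> float:
    """Mean fixation time with a changed preview minus mean with a full preview.

    A positive value means the target word was read faster when its
    parafoveal preview was the word itself.
    """
    return mean(changed_preview_ms) - mean(full_preview_ms)


# Made-up gaze durations (ms) on the target word in each condition.
full = [220, 240, 230]
changed = [260, 270, 250]
print(preview_benefit(full, changed))  # 30
```

Comparing the benefit across preview types (for example, visually similar versus dissimilar letter strings) is what allows inferences about the kind of information extracted from the parafovea.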

5 Eye Movements During Foveal and Parafoveal Processing of Words

In this section, we summarize what is known about foveal and parafoveal processing of single words, as revealed by readers’ eye movement patterns. Foveal word processing refers to cognitive processes carried out for the currently fixated word that falls within foveal vision. Parafoveal word processing, in turn, refers to processing done in the parafoveal region extending up to 5° of visual angle to the right and left of the current fixation. In what follows, we first discuss foveal word processing among adult readers, followed by a section focusing on young, developing readers.

5.1 Foveal Word Processing Among Competent Adult Readers

When readers fixate a word, fixation time spent on the word faithfully reflects the cognitive processes needed to identify the word. The most frequently used eye fixation measure to tap into foveal word processing is gaze duration, which sums up the durations of fixations made on the word when it is first encountered and before a saccade is launched away from the word (typically to the subsequent word). A robust and consistent finding has been the word frequency effect. Written words that appear infrequently in the language are read with longer gaze durations than words whose frequency is high (e.g., Inhoff & Rayner, 1986; Rayner & Duffy, 1986). The effect materializes either as longer durations of individual fixations made on the infrequent word or as an increased probability of making a refixation on the word, or both. An intriguing observation has been made by Liversedge et al. (2004) using the disappearing text paradigm (see Box 2; see also Rayner, Inhoff, Morrison, Slowiaczek, & Bertera, 1981). In their experiment, high- and low-frequency words were presented for foveal inspection during sentence reading for 60 ms, after which the word disappeared and the readers fixated on an empty space between two parafoveal words. Gaze durations on the empty space were longer when the empty space replaced a low-frequency word than a high-frequency word. This is compelling evidence for the view that fixation times reflect the mental processes ongoing during reading. The process of accessing a mental representation for a low-frequency word takes more time to complete than that for a high-frequency word. This mental process is reflected in the fixation time on the empty space even when the to-be-identified word is no longer visually present. The study also demonstrates that a relatively short exposure time (60 ms or so) is enough to acquire sufficient visual input for the word recognition process to proceed normally.
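The definition of gaze duration given above translates directly into a computation over a sequence of fixations. The following is a minimal sketch, assuming a hypothetical data format in which each fixation is a (word index, duration in ms) pair in temporal order; it also returns the first fixation duration, a measure used later in this section. Additional criteria that are sometimes applied, such as excluding words that were initially skipped, are omitted.

```python
from typing import List, Optional, Tuple

# Hypothetical format: (word_index, duration_ms), in temporal order.
Fixation = Tuple[int, int]


def first_pass_measures(fixations: List[Fixation],
                        word: int) -> Optional[Tuple[int, int]]:
    """Return (first fixation duration, gaze duration) for `word`.

    Gaze duration sums all fixations on the word from the first time it
    is fixated until the eyes leave it. Later returns to the word are
    not included. Returns None if the word was never fixated.
    """
    first_fixation = None
    gaze = 0
    for w, duration in fixations:
        if w == word:
            if first_fixation is None:
                first_fixation = duration
            gaze += duration
        elif first_fixation is not None:
            # The eyes have left the word: the first pass is over.
            break
    if first_fixation is None:
        return None
    return first_fixation, gaze


# Made-up record: word 2 is fixated twice, left, and later revisited.
record = [(0, 210), (1, 230), (2, 250), (2, 180), (3, 200), (2, 300)]
print(first_pass_measures(record, 2))  # (250, 430)
print(first_pass_measures(record, 5))  # None
```

In the made-up record, the refixation on word 2 adds to its gaze duration, whereas the later regression back to the word does not, which is how a frequency effect can appear in gaze duration through refixations even when individual fixation durations are unchanged.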

Another key finding is the word predictability effect: words that are highly predictable from the prior sentence or discourse context receive shorter fixation times than contextually unpredictable words (e.g., Balota, Pollatsek, & Rayner, 1985; Calvo & Meseguer, 2002; Hyönä, 1993; Rayner & Well, 1996). Also local, lexically-based predictability influences fixation time on words (Vainio, Hyönä, & Pajunen, 2009). When a verb strongly constrains the identity of the upcoming word(s) (e.g. “he hunched his back”), gaze duration is shorter on the highly constrained phrase (“his back”) than on the same phrase preceded by a less constraining verb (e.g., “he hurt his back”).

Alphabetic scripts are based on principles of converting spoken language codes to written counterparts. In orthographically completely transparent scripts there is a direct mapping between letters (i.e., graphemes) and sounds (i.e., phonemes). In other words, each grapheme corresponds to only one phoneme, and each phoneme is always represented by the same grapheme. A prime example of such a script is Finnish, where the phoneme-grapheme correspondence is practically 100%. Examples of less straightforward or more opaque alphabetic languages are English and Danish. An example of a phonologically opaque word in English is “choir”, for which rule-based grapheme to phoneme mapping would yield an output significantly different from the correct one.

When readers of alphabetic scripts process words in sentences, perhaps unsurprisingly the identification process entails a phonological recoding phase, which is reflected in fixation times on words. For example, Inhoff and Topolski (1994) found longer fixation times on words that had an irregular (e.g., “weird”) than regular (e.g., “mood”) spelling. The effect was short-lived, as it was obtained for the initial fixation made on the word but not for gaze duration (i.e., the summed duration of fixations made before fixating away from the word). This finding is evidence for the early activation of phonological codes during word recognition. Further evidence supporting early activation of phonological representations comes from studies where homophones were inserted in target sentences. Homophones are words that sound the same but are written differently (e.g., “bear” and “bare”). Rayner, Pollatsek, and Binder (1998) observed no difference in first fixation duration between the correct word form and its homophonic counterpart, despite the fact that the homophonic word was different in meaning. Signs of meaning activation were observed in gaze duration, which was longer for the homophonic than the correct word. Analogous results were observed for French by Sparrow and Miellet (2002) who found no difference in first fixation duration between correctly spelled words and homophonic non-words. Finally, Sereno and Rayner (2000) demonstrated that phonological recoding is more robust for infrequent than frequent words. They observed a phonological regularity effect for infrequent words but not for frequent words.

Written words are identified via the individual letters they contain. Thus, it is not surprising that the characteristics of letters and letter clusters are also reflected in fixation times on words. Word-external letters appear to be more relevant for successful word identification than word-internal letters. This became evident from the study of White, Johnson, Liversedge, and Rayner (2008). White et al. jumbled up letters both word-internally (e.g., “problem” vs. “probelm”) and word-externally (“problme” and “rpoblem”). The transposed-letter conditions produced longer fixation times than the correct condition. The transposed letter effect in fixation times was greater for the word-external than word-internal transpositions, indicating that letters in the beginning and end of a word are more crucial for word identification than letters in the middle of the word.

Another finding is that consonants play a more significant role in word recognition than vowels. This was demonstrated by Lee, Rayner, and Pollatsek (2001, 2002) using the so-called fast priming paradigm (Sereno & Rayner, 1992). In their version of the paradigm, when a word was fixated the presentation of one of the word’s letters, either a consonant or a vowel, was delayed for 30 or 60 ms. They found that delaying the presentation of a consonant for 30 ms resulted in significantly longer gaze durations on the word than delaying the presentation of a vowel. Such an effect was not present in the 60-ms presentation condition. The pattern of results was interpreted to suggest that in the early stages of foveal word processing, consonants play a more significant role than vowels.

Where in the word the reader initially lands with his/her eyes also has a significant impact on eye behavior on words and on the word processing efficiency. If the eyes land in the word center, the reader is less likely to make a refixation on the word, and the gaze duration is much shorter than is the case when the eyes initially land in the word beginning or end. This was first demonstrated by O'Regan, Lévy-Schoen, Pynte, and Brugaillere (1984) for isolated words, but it was subsequently extended to reading words in sentences (e.g., Hyönä & Bertram, 2011; Nuthmann, Engbert, & Kliegl, 2005; Vitu, O’Regan, & Mittau, 1990; Vitu, McConkie, Kerr, & O’Regan, 2001). Thus, word center is the optimal viewing position (OVP) for smooth word processing. Presumably, this is due to all or most letters of the word falling on the fovea, at least when the word is relatively short. Fortunately, OVP is also close to the preferred landing position in reading (Rayner, 1979); that is, readers are likely to launch the first saccade into the word so that it lands close to the word center. The preferred landing position typically departs somewhat from the optimal viewing location by being a bit closer to the word beginning. A saccade to a word is launched on the basis of the length information extracted from the parafovea. Thus, when a word is short, the amplitude of the saccade into the word is shorter than is the case when programming a saccade to a long word (McConkie, Kerr, Reddix, & Zola, 1988).

Research has also identified an intriguing phenomenon that runs counter to the OVP effect. When readers make only one fixation on the word, this single fixation is longest when positioned in the word center and shortest when located toward the word beginning or end. In other words, if only one fixation is made on a word, the word center is not the optimal viewing position. This finding has been termed the inverted optimal viewing position (IOVP) effect. Its exact nature is not yet known. It is at least partly explained by mislocated fixations that land toward the word beginning or end. The idea here is that the fixation at the word beginning is intended to land on the previous word and the fixation at the word end is intended to land on the subsequent word. Because a corrective saccade is quickly programmed, the duration of these mislocated fixations is short. The relatively long duration of a single fixation that lands in the middle of the word may also be related to the amount of perceptual information that needs to be processed: there is more visual information to be gleaned in the middle of the word than at the word edges (Vitu, Lancelin, & Marrier d’Unienville, 2007).

5.2 Foveal Word Processing Among Developing Readers

A significant part of the seminal work on eye movements in reading dealt with differences in eye movement patterns as a function of age and reading development (Buswell, 1922; Taylor, 1965). The seminal work demonstrated that less mature readers make more and longer fixations, shorter saccades and more regressive fixations that take the eyes leftward in text (see Fig. 7.3). More recent work (for a review, see Blythe & Joseph, 2011) has confirmed these global effects. These global differences between developing and mature readers also feature in word-level reading: younger and less experienced readers make more refixations on words and skip over words less frequently than older and more experienced readers. These developmental differences do not reflect differences in maturation of the oculomotor system, but are instead a reflection of differences in lexical processing efficiency. This becomes apparent, for example, in model simulations carried out by Reichle et al. (2013) and Mancheva et al. (2015). They applied the E-Z Reader model of eye movement control in reading (see Sect. 7.4) to account for developmental differences in readers’ eye movement patterns. Reichle et al. (2013) showed that the main difference between children’s and adults’ eye behavior in reading can be explained by overall lexical processing speed. In other words, the crucial difference between children and adults is that adults are faster in word identification. This conclusion was supported by Mancheva et al. (2015), who observed that the model parameter indexing lexical processing efficiency correlated strongly with children’s lexical skills as measured by offline tests, particularly with orthographic processing ability. Other studies have also shown that tasks that tap into linguistic processing efficiency are better predictors of eye fixation times on words than tasks that measure oculomotor efficiency (Huestegge, Radach, Corbic, & Huestegge, 2009).

Fig. 7.3
figure 3

The developmental pattern of some eye movement characteristics during reading from the first to sixth grade and in comparison to skilled adult readers (as reported in Rayner, Ardoin, & Binder, 2013)

Further support for the claim that the oculomotor system is well developed among normally developing children when they begin to read comes from the study of McConkie et al. (1991). They showed that even first-grade elementary school children demonstrate the preferred landing position effect obtained among skilled readers (see also Joseph, Liversedge, Blythe, White, & Rayner, 2009). As described above, the phenomenon refers to the finding that the initial fixation in the word lands close to the word center, which is the optimal location for word recognition. Huestegge et al. (2009) found the initial fixation location to be nearer the word beginning among 2nd graders than 4th graders. They ascribed the difference to the 2nd graders’ tendency to read words with more than one fixation, whereas 4th graders more often read words with a single fixation. Given such a refixation strategy, it makes sense for the 2nd graders to direct the initial fixation closer to the word beginning.

Given the findings demonstrating that children’s eye fixation patterns reflect their lexical processing efficiency, it is understandable that children’s eye fixation patterns also show effects of word frequency and word length. Hyönä and Olson (1995) found that these two factors interacted so that gaze durations on words were particularly long when the words were long and infrequent. Huestegge et al. (2009) in turn found the word frequency and word length effects to be smaller for 4th graders than 2nd graders. Similarly, Joseph et al. (2009) observed stronger word length effects in gaze duration and refixation probability for children than adults.

Studies using the disappearing text paradigm (Liversedge et al., 2004; Box 2) suggest that even very young readers are able to extract visual information efficiently. Blythe, Liversedge, Joseph, White, and Rayner (2009) found that presentation times as short as 40–75 ms were sufficient for reading to proceed normally even among 7-year-old children. A typical finding in studies employing the disappearing text paradigm is that readers stop making refixations on “words” (i.e., the empty space where the word was briefly presented). This is understandable: as there is no visual input, there is nothing to direct a second fixation to. Blythe, Häikiö, Bertram, Hyönä, and Liversedge (2010) used the disappearing text paradigm to study reading of short (4 letters) and longer (8 letters) words among 8–9 year-olds, 10–11 year-olds and adults. The study demonstrated that the youngest (8–9 year-old) children regressed back to the longer words in order to get another sample of the word (in the paradigm, the word reappears when a saccade is launched away from it). This effect was reduced for older children (10–11 year-olds) and absent for adults. The need for obtaining a second visual sample of longer words among young readers is interpreted to reflect younger readers’ smaller perceptual span (Häikiö, Bertram, Hyönä, & Niemi, 2009; Rayner, 1986).

5.3 Parafoveal Word Processing Among Competent Adult Readers

In addition to extracting text information from the foveated words, readers also glean useful information from the parafovea that extends 5° of visual angle to the right and left around the fixation point.

A key finding that has emerged from the text window (see Box 2) studies is that readers’ perceptual span is heavily biased to the right (when reading from left to right; McConkie & Rayner, 1976). The perceptual span extends up to 15 letters to the right of fixation, whereas the major determinant of the left boundary appears to be the beginning of the currently fixated word (Rayner, Well, & Pollatsek, 1980; Rayner, Well, Pollatsek, & Bertera, 1982). Recent work has slightly modified the latter conclusion by showing that the leftward span can extend to the previous word when the reader’s attention is not fully disengaged from the word to the left (i.e., processing of Word N − 1 is not complete) of the currently fixated word (Apel, Henderson, & Ferreira, 2012; Binder, Pollatsek, & Rayner, 1999; McGowan, 2015). This rightward asymmetry in perceptual span is an attentional effect and not related to the brain’s hemispheric specialization. This has become apparent from studies conducted in Hebrew and Arabic, which are read from right to left. It has been found that the perceptual span of Hebrew and Arabic readers is asymmetric (greater) to the left, that is, toward the reading direction (Jordan et al., 2014; Pollatsek, Bolozky, Well, & Rayner, 1981).
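To illustrate what such an asymmetric window looks like on a line of text, the sketch below masks every letter outside a window that starts at the beginning of the fixated word and ends 15 character positions to the right of fixation. It is an illustration under stated assumptions rather than a reconstruction of any published display routine: masking with the letter x, preserving spaces, and counting spaces as character positions are all choices made here for simplicity, and the function name is hypothetical.

```python
def asymmetric_window(text: str, fix_pos: int, right_span: int = 15,
                      mask_char: str = "x") -> str:
    """Mask letters outside an asymmetric window around the fixated character.

    Left boundary: beginning of the currently fixated word.
    Right boundary: right_span character positions to the right of fixation.
    Spaces are left intact so that word length information is preserved.
    """
    if not 0 <= fix_pos < len(text):
        raise ValueError("fixation position must fall within the text")
    # Walk leftward to the first character of the fixated word.
    left = fix_pos
    while left > 0 and text[left - 1] != " ":
        left -= 1
    right = min(len(text), fix_pos + right_span + 1)  # exclusive upper bound
    masked = []
    for i, ch in enumerate(text):
        if left <= i < right or ch == " ":
            masked.append(ch)
        else:
            masked.append(mask_char)
    return "".join(masked)


line = "the quick brown fox jumps over the lazy dog"
print(asymmetric_window(line, fix_pos=5))  # fixation on the "u" of "quick"
# xxx quick brown fox jxxxx xxxx xxx xxxx xxx
```

In a gaze-contingent experiment the function would be called anew for each fixation, so that the window moves with the eyes; varying right_span across conditions is what allows the size of the span to be estimated.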

A number of studies have examined what type of information readers glean from the parafovea to the right of fixation (when reading from left to right). The primary method in these studies has been the boundary paradigm (see Box 2). A meta-analysis of studies using the boundary paradigm is reported by Vasilev and Angele (2017). These studies have shown that word length information is extracted up to 15 letters to the right of fixation (McConkie & Rayner, 1975). When the parafoveal preview provides correct word length information, reading is faster than when the parafoveal word length information is incorrect (Inhoff, Radach, Eiter, & Juhasz, 2003; White, Rayner, & Liversedge, 2005). Length information is also utilized in saccadic programming. The incoming saccade to the following word is longer when that word is longer (e.g., McConkie et al., 1988). As discussed above, parafoveal word length information is utilized to position the eyes toward the word center to optimize word recognition.

Orthographic information is also picked up from the parafovea. The overall visual shape of parafoveal words is perceived (e.g., McConkie & Rayner, 1975). This becomes apparent in fixation times being shorter when the parafoveal previews are visually similar rather than dissimilar to the correct word. Letter identity information is also processed parafoveally. Using the moving window technique (see Box 2), Häikiö, Bertram, Hyönä, and Niemi (2009) demonstrated that the letter identity span extends up to 9 letters to the right of fixation. The conclusion is based on a comparison between the normal condition (no window) and the condition where letters outside the window around the fixation were replaced with visually similar letters, preserving their overall visual shape but changing their identity. By applying the boundary paradigm (see Box 2), Johnson et al. (2007) used parafoveal previews in which adjacent letters were transposed either word-externally (e.g., leement or elemetn instead of element) or word-internally (elemnet). Using 7-letter words as their target stimuli, they showed that gaze durations on the target word increased when it was preceded by a transposed-letter preview affecting the word-external letters, but not the word-internal letters. This led Johnson et al. to conclude that “readers are able to extract information from the first five letters of the word to the right of fixation plus the word-final letter” (p. 222). The conclusion that more parafoveal orthographic information is extracted from word-external than word-internal letters is also supported by the study of Briihl and Inhoff (1995). This is presumably due to word-internal letters suffering more from visual crowding than word-external letters.

Parafoveally available orthographic information can also affect saccadic programming. Hyönä (1995a) found the initial saccade into the word to land closer to the word beginning if there was a highly infrequent letter cluster in the word beginning, compared to the word beginning hosting a frequent letter combination (see also Radach, Inhoff, & Heller, 2004; White & Liversedge, 2006). Similarly, White and Liversedge (2004) found the initial fixation to land closer to the word beginning if the word beginning contained a misspelling. Moreover, when a word contained a misspelling in the beginning, a regression was frequently launched toward the word beginning after an initial fixation on the word.

There is ample evidence that phonological information is extracted from the word appearing in the parafovea. Evidence comes from boundary paradigm (see Box 2) studies with different kinds of phonological preview manipulations. Pollatsek, Lesch, Morris and Rayner (1992) and Miellet and Sparrow (2004) presented homophonic previews of target words (e.g., target word site was parafoveally previewed as cite), which were found to facilitate subsequent foveal processing of the target words. Chace, Rayner, and Well (2005) obtained a parafoveal homophone effect for skilled readers but not for less skilled readers. Henderson, Dixon, Petersen, Twilley, and Ferreira (1995) manipulated phonological regularity in the word beginning and observed preview words with phonologically regular initial trigrams to benefit subsequent foveal processing more than preview words with irregular initial trigrams. Finally, Ashby, Treiman, Kessler, and Rayner (2006) manipulated phonological vowel concordance (cherg -> chirp vs. chord -> chirp) between the preview string (non-word) and the target word. Concordant previews led to shorter gaze durations on the target words than discordant previews, which provides further converging evidence for parafoveal phonological processing.

There has been a long-standing debate about whether readers can parafoveally extract lexical-semantic information. One reason for the sometimes heated debate is that its presence or absence has important implications for the competing theoretical models of eye guidance during reading. As explained in Sect. 4, a key difference between these models is the extent to which word identification is assumed to be serial versus parallel (Engbert & Kliegl, 2011; Reichle, 2011). Evidence for parafoveal lexical-semantic effects speaks for parallel identification of more than one word. Thus, it may be used as evidence for parallel models and against serial models.

Evidence for parafoveal lexical-semantic effects has been sought via parafoveal-on-foveal effects by manipulating the frequency or semantic plausibility of Word N + 1 and measuring their effects on fixation time on Word N. The evidence has been mixed (Drieghe, 2011; Hyönä, 2011). The boundary paradigm (see Box 2) has also been applied by manipulating the semantic relatedness of parafoveal previews to the intended word. Earlier evidence primarily spoke for an absence of parafoveal semantic effects (Rayner, White, Kambe, Miller, & Liversedge, 2003). More recently, however, Hohenstein and Kliegl (2014) observed a parafoveal semantic effect using the standard boundary paradigm (i.e., semantically related and unrelated previews were parafoveally available for the entire time Word N − 1 was fixated). Hohenstein, Laubrock, and Kliegl (2010) observed fast parafoveal semantic priming using a modified boundary paradigm where the target word was initially replaced with a random string of consonants. When Word N − 1 was fixated, the random letter string was first replaced with a semantically related or unrelated word for a variable amount of time (20–125 ms, depending on the experiment), followed by the target word. They found gaze duration on the target word to be shorter in the semantically related than unrelated condition, when the parafoveal preview was present for the first 125 ms of the fixation on Word N − 1 (or for only 80 ms when the parafoveal preview appeared in boldface). The results were taken as evidence in favor of parallel word processing in reading. Finally, Schotter (2013) observed a parafoveal semantic preview effect for synonyms (e.g., curlers was replaced with rollers) but not for semantic associates (e.g., curlers was replaced with styling). The study suggests that similarity in meaning between the target word and its parafoveal preview influences the degree to which parafoveal semantic effects may be observed. Moreover, Schotter speculates that it may be easier to find parafoveal semantic effects in languages with regular phoneme-grapheme correspondence rules, such as German (Hohenstein & Kliegl, 2014; Hohenstein et al., 2010), than in less regular languages, such as English. The idea is that as foveal word processing is made relatively easy in regular languages, more attentional resources may be devoted to parafoveal word processing.

5.4 Parafoveal Word Processing Among Developing Readers

Recently, there has been an increased interest in studying parafoveal processing among developing readers. The seminal study of Rayner (1986) demonstrated that the readers’ perceptual span develops as a function of reading ability. The perceptual span of 6th grade children was observed to be analogous to that of adults, while the perceptual span of 2nd and 4th grade children was smaller. Häikiö et al. (2009) replicated these developmental trends. Rayner demonstrated that the perceptual span for word length information extends to 11 letters to the right of fixation for 2nd and 4th graders, and it grows up to 14–15 letters among 6th graders and adults. The perceptual span for global letter feature information is somewhat narrower, extending to 7 letters for 2nd graders and to 11–12 letters from the 4th grade onwards. Häikiö et al. studied the perceptual span for letter identity information and found it to grow from 5 letters to the right for 2nd graders to 7 letters among 4th graders and 9 letters among 6th graders and adults. In a moving window study, Sperlich, Schad, and Laubrock (2015) demonstrated that the growth of perceptual span (for letter feature information) in the first stages of the development of reading skill during Grades 1–3 takes place between the second and third school year, whereas little growth is visible between Grades 1 and 2.

Regarding parafoveal word processing, there appear to be few differences between developing and mature readers. The study of Häikiö, Bertram, and Hyönä (2010) was one of the first where the boundary paradigm (see Box 2) was applied to the study of developing readers. Studies with adult readers had demonstrated that readers extend their attentional span more strongly across a spatially unified letter cluster (i.e., an unspaced compound word such as basketball) than across a linguistic unit that comprises two words (i.e., spaced compound words such as tennis ball) (Hyönä, Bertram, & Pollatsek, 2004; Juhasz, Pollatsek, Hyönä, Drieghe, & Rayner, 2009). Häikiö et al. (2010) extended these results to developing readers: surprisingly, despite the fact that 2nd grade readers’ perceptual span is significantly smaller than that of adult readers, even they displayed the same effect.

The results of Tiffin-Richards and Schroeder (2015) suggest a developmental shift in parafoveal processing when reading an orthographically regular language (German). As reading skill develops, readers shift from using parafoveal phonological information to using parafoveal orthographic information. By applying the boundary technique (see Box 2), Tiffin-Richards and Schroeder compared children’s and adults’ parafoveal processing of phonological and orthographic information in German. Their main finding was that children but not adults showed parafoveal phonological effects. In contrast, adults demonstrated effects indexing orthographic parafoveal processing, while children showed these effects only under specific conditions.

6 Eye Movements During Sentence Comprehension

Identification of individual words is not enough for successful reading comprehension; the successive words also need to be integrated to understand the meaning of a whole clause and sentence. This process takes place incrementally, as readers form meanings of successive words as they move forward in text. Yet, eye-tracking studies have shown that readers pause for a longer time at sentence (and also clause) boundaries, presumably to integrate the sentence meaning (Just & Carpenter, 1980; Rayner, Kambe, & Duffy, 2000; White, Warren, & Reichle, 2011). This phenomenon is called the sentence wrap-up effect. Readers do not proceed to the following sentence until they have secured a sufficient understanding of the currently read sentence (or clause). Integrative processing at sentence boundaries may also be reflected in regressive fixations launched to earlier parts of the sentence (Hyönä, 1995b; Kaakinen & Hyönä, 2007) or sometimes to an earlier sentence. For example, when readers process a text with unfamiliar text contents, they are more likely to initiate a regression to an earlier part of the sentence, particularly when the sentence contains information pertinent to their reading goal (Kaakinen & Hyönä, 2007). By regressing back in text, readers provide themselves with another opportunity to visually sample a text region they might find difficult to understand and/or important to form a good grasp of its meaning.

Successful sentence comprehension also requires that syntactic relations between words are sorted out in order to achieve a correct interpretation of the sentence. To do so, the reader needs to parse the syntactic structure of the sentence. Syntactic parsing entails that the reader identifies the actor of the action depicted by the main verb, whom the action is directed to, where the action takes place, etc. Syntactic information is conveyed in sentences, for example, by word order, morphological case marking, verb argument structure (whether a verb is transitive or intransitive), and animacy of the depicted entities (whether or not they refer to animate entities capable of initiating the action depicted by the verb). These processes are reflected in readers’ eye movements (Clifton, Staub, & Rayner, 2007; Clifton & Staub, 2011).

That syntactic parsing is typically incremental in nature is nicely demonstrated by the so-called garden-path effect (Frazier & Rayner, 1982). The effect reflects a misparse of a sentence that is syntactically locally ambiguous. One such sentence in English is “Since Jay always jogs a mile seems like a long distance to him”. When reading a sentence like this, the readers are “led down the garden path”, as they typically attach “a mile” to the first clause as the sentence object (“Since Jay always jogs a mile”). However, that is not the intended meaning; instead “a mile” should be considered the syntactic subject of the second clause. Readers realize this when they fixate the word “seems”, which is fixated for a long time and is frequently followed by a regression to an earlier part of the sentence (Mitchell, Shen, Green, & Hodgson, 2008) and a series of rereading fixations made in the service of correcting the initial misparse. This pattern of results is taken as evidence for the so-called late closure principle of the garden path theory (Frazier & Rayner, 1982). Apart from syntactic ambiguity, previous research has examined effects of syntactic complexity, syntactic prediction and syntactic violations on readers’ eye movement patterns (Clifton & Staub, 2011).

As the eye movement record provides a real-time protocol of processing as it evolves through time, eye movement data have been used to tease apart the time course of sentence parsing. The relative degree of delay in the observed effects has consequences for theories of sentence parsing. However, there is no uniform pattern in the timing of effects, with some researchers finding an early syntactic effect in the duration of the first fixation made on the critical text region, while other researchers observe only delayed effects, for example in the probability of regression or in the fixation time on the region following the critical region (for an extensive summary of these studies, see Clifton et al., 2007).

7 Eye Movements During Text Reading

Understanding a single sentence is not sufficient; readers also need to integrate the meaning of successive sentences to construct the meaning of whole text paragraphs and even larger text segments. Reading a text consisting of a full page (or several pages) means that a reader has to navigate through several lines of text. This is quite different from reading a single sentence that typically extends through only one or two lines; thus, certain eye movement patterns are typical for reading a text page. Moreover, the increased cognitive demands of understanding a text instead of single sentences are reflected in eye movements. In the following, we will outline the typical characteristics of eye movement patterns related to text reading.

As the reader navigates through the lines of text, the eyes move from the beginning of each line of text to the end of it, and then to the beginning of a new line. A return sweep refers to an eye movement initiated from the end of a line towards the beginning of a new line. This is such a long saccade that it is often inaccurate, typically undershooting the target at the beginning of the new line. Thus, the reader makes a corrective eye movement, and typically there is an “extra”, short fixation close to the beginning of the new line of text.
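The return sweep and its corrective saccade can be picked out of a fixation sequence with a simple rule-based classifier. The sketch below assumes a hypothetical data format in which each fixation is coded as a pair of line number and horizontal character position; the labels and thresholds are illustrative choices, not an established algorithm from the literature.

```python
from typing import List, Tuple

# Hypothetical format: (line_number, character_position), in temporal order.
Fixation = Tuple[int, int]


def classify_saccades(fixations: List[Fixation]) -> List[str]:
    """Label each saccade between successive fixations.

    "return_sweep": moves to the next line and leftward.
    "regression":   moves leftward on the same line, or to an earlier line.
    "forward":      everything else (progressive movement through the text).
    """
    labels = []
    for (line0, x0), (line1, x1) in zip(fixations, fixations[1:]):
        if line1 == line0 + 1 and x1 < x0:
            labels.append("return_sweep")
        elif line1 < line0 or (line1 == line0 and x1 < x0):
            labels.append("regression")
        else:
            labels.append("forward")
    return labels


def undershoot_corrections(labels: List[str]) -> List[int]:
    """Indices of leftward saccades that immediately follow a return sweep.

    Assumption: such a saccade is treated as correcting an undershoot.
    """
    return [i for i in range(1, len(labels))
            if labels[i - 1] == "return_sweep" and labels[i] == "regression"]


# The eyes reach the end of line 0, sweep to position 8 on line 1,
# and then step left to position 2 before reading on.
scanpath = [(0, 5), (0, 14), (0, 9), (0, 22), (0, 60), (1, 8), (1, 2), (1, 12)]
labels = classify_saccades(scanpath)
print(labels)
print(undershoot_corrections(labels))  # [5]
```

Flagging the saccade after a return sweep separately matters in practice, because the short fixation that precedes it would otherwise be counted as an ordinary fixation followed by a regression.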

Research suggests that readers do not extract meaningful information, such as word meanings, from the lines below the currently fixated line of text (Pollatsek, Raney, Lagasse, & Rayner, 1993). Occasionally words can be identified one or two lines below the currently fixated line, but three lines down is already too far for word identification to occur. However, even though word identification from more than two lines of text down is not likely, readers do extract information about the layout of the text page. Evidence for this comes from studies conducted by Cauchard, Eyrolle, Cellier, and Hyönä (2010a, b). Cauchard et al. (2010a) examined whether reading is affected by a text window (see Box 2) that restricts the visibility of a text page to the fixated line and two lines above and below the fixated line. The text itself contained organizational signals, such as subheadings and paragraph breaks, typical for expository texts; such signals are used to cue the content structure of the text. It was found that comprehension was poorer in the text window condition in comparison to the normal reading condition. Readers also tended to make more regressions to headings in the normal than in the window condition. These results suggest that readers make use of the page layout information available in the peripheral vision during reading in order to guide long-distance look-backs, which in turn may be crucial for comprehension. In another study using a similar window paradigm (Cauchard et al., 2010b), participants were asked to look for answers to specific questions in text. Readers displayed longer search times, more and longer eye fixations and shorter saccades if they were denied a preview of organizational signals such as headings and paragraph marks in the periphery, indicating that readers do make use of these cues when they navigate through text.

As noted above, it is not enough to identify the individual words and parse the sentence structure, but the information conveyed by the sentences should also be integrated into a coherent memory representation (see, e.g., Kintsch, 1998). These more global integrative processes influence the eye movement patterns already at the word level. Gaze durations on words are shorter during reading of passages than single unrelated sentences (Radach, Huestegge, & Reilly, 2008). Total fixation time on the words, on the other hand, is longer during passage reading than sentence reading. These results indicate that presenting words in a text facilitates the initial encoding of words but increases the need to reread words in text. Increased rereading rate presumably reflects integrative processing at the text or paragraph level. Moreover, saccade amplitudes are greater and saccades land further into the words during passage than sentence reading. In sum, it seems that both the temporal and spatial aspects of eye movements differ between sentence and text reading.

As mentioned earlier, at sentence boundaries readers integrate the information presented in the sentence before they move on to the next sentence, producing increased fixation times on the last word of a sentence (Rayner, Kambe, & Duffy, 2000). If there is a greater need to obtain a well-integrated memory representation of the text, wrap-up times are increased. For example, when a reader encounters text information that is highly pertinent to the reader’s reading goal, wrap-up times are longer than when the sentence contains information that is not relevant to the reader (Kaakinen & Hyönä, 2007).

Problems in integrating sentence information into the evolving memory representation of the text are also reflected in sentence wrap-up times. For example, when reading an ambiguous text passage without a title that would indicate what the passage is about, wrap-up times are increased in comparison to a condition where the title is given before reading (Wiley & Rayner, 2000). When reading a passage describing, for instance, a space trip to the moon, comprehending what it is about is more difficult if the reader does not know the topic of the text in advance. This is reflected in the eye movement patterns as increased gaze durations on words, more regressive eye movements, and longer sentence wrap-up times.

During reading of longer texts, readers may make regressive eye movements, also called look-backs, to previously read parts of text in order to integrate text information to memory. Instead of being a signature of inefficient reading, as is often believed, regressions to earlier parts of text seem to be fundamental for successful comprehension. Using the trailing mask paradigm (see Box 2), Schotter, Tran, and Rayner (2014) demonstrated that denying readers the opportunity to resample words was harmful for sentence comprehension. Moreover, previous studies suggest that readers who make look-backs to informative parts of the text gain better memory of the information presented in the text than readers who do not make look-backs. This was demonstrated in the study of Hyönä, Lorch, and Kaakinen (2002), where adult readers read expository texts that followed a typical expository text structure and contained subheadings, which marked the topic of the following paragraph. Readers who tended to look back to headings, typically from the end of the paragraph, gained good memory of the information presented in the text. Selective look-backs to informative parts of the text are most likely strategic in nature, meaning that they reflect a conscious decision to reread parts of text that the reader believes will help in constructing a good memory representation of the text (Hyönä & Nurminen, 2006). On the other hand, readers who unselectively reread parts of text showed poorer memory for text information (Hyönä et al., 2002).

Look-backs may also be triggered by problems in integrating text information with the previous text context. An ironic statement is an example of a situation in which an utterance does not literally and directly fit into the context in which it is presented. Consider the phrase “What great weather for a picnic!” If this sentence is presented in a passage describing a rainy and windy day, hence carrying an ironic meaning, readers make more look-backs to it than if it is presented in a context describing a beautiful sunny day (Kaakinen, Olkoniemi, Kinnari, & Hyönä, 2014). Look-backs in this case are assumed to reflect attempts to resolve the incongruence between the literal meaning of the phrase and the context in which it is presented.

7.1 Task Effects in Text Reading

Readers often have a specific goal in mind when reading longer texts, such as reading in order to learn new information on a topic or looking for a certain type of information. Previous research shows that adult readers adjust their intake of text information to meet the demands of the reading task (e.g., Heller, 1982; Laycock, 1955; Radach et al., 2008; Rothkopf & Billington, 1979; Shebilske & Fisher, 1983; Wotschack & Kliegl, 2013). For example, when expecting difficult rather than easy questions after reading, readers make shorter saccades and fixate more function words (prepositions, articles, etc.) in text (Wotschack & Kliegl, 2013). Moreover, the reading task may influence the local processing of the text information such that readers show different eye movement patterns in different regions of the same text. Rothkopf and Billington (1979) showed that when looking for answers to specific questions, readers made more and longer eye fixations in sentences that contained question-relevant than question-irrelevant information. In other words, the reading task may influence not only the global processing of text information, by inducing a more careful reading strategy (e.g., Wotschack & Kliegl, 2013), but also the local processing, by increasing the amount of time spent on specific parts of the text (e.g., Rothkopf & Billington, 1979).

Readers are sensitive to the goal relevance of text information and tend to selectively attend to information that is pertinent to their goal (Kaakinen & Hyönä, 2007, 2008, 2014; Kaakinen, Hyönä, & Keenan, 2003). Kaakinen et al. (2003) asked adult participants to read two expository texts describing various diseases so that they could explain critical facts about one of the diseases described in each text to somebody else. The instructions thus made information related to one disease highly relevant to the readers, whereas other information presented in the text could be considered irrelevant. The results of the study showed that the relevance effect, that is, the difference in fixation time between reading the sentences as task-relevant or task-irrelevant, emerged early in processing, as revealed by the progressive fixation time during first-pass reading of the sentence (for eye movement measures used in text comprehension studies, see Hyönä, Lorch, & Rinck, 2004). This means that readers reacted to task relevance as soon as they could identify text information as relevant (or irrelevant). The relevance effect was also observed in later look-back time, indicating that readers also later reread relevant sentences more than irrelevant sentences.

In another study examining relevance effects at the level of individual words (Kaakinen & Hyönä, 2007), it was observed that these effects appear already during the initial processing of words. Gaze durations for words (i.e., the time spent fixating on a word before proceeding to the next word) were longer in task-relevant than in task-irrelevant sentences, even for words at the beginning of the target sentences. Moreover, readers tended to skip over more words within irrelevant than relevant sentences.
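The gaze duration measure described above can be made concrete with a minimal sketch. It assumes a hypothetical record format in which each fixation has already been mapped to a word index and a duration in milliseconds; the function name `gaze_durations` and the example data are invented for illustration and do not come from the cited studies. A word receives a gaze duration only if it is first fixated before any word further along in the text has been fixated.

```python
from typing import Dict, List, Tuple

# Each fixation is (word_index, duration_ms). This record format is an
# assumption made for illustration, not a standard eye-tracker export.
Fixation = Tuple[int, int]


def gaze_durations(fixations: List[Fixation]) -> Dict[int, int]:
    """Sum the consecutive fixations of the first visit to each word.

    A word only gets a gaze duration if it is first fixated before any
    later word in the text has been fixated (i.e., during first pass).
    """
    gaze: Dict[int, int] = {}
    furthest = -1  # largest word index fixated so far
    i = 0
    while i < len(fixations):
        word = fixations[i][0]
        # Collect the run of consecutive fixations on the same word.
        j = i
        run_total = 0
        while j < len(fixations) and fixations[j][0] == word:
            run_total += fixations[j][1]
            j += 1
        if word > furthest:
            gaze[word] = run_total
            furthest = word
        i = j
    return gaze


# Made-up fixation sequence: word 2 is skipped, word 1 is later reread.
example = [(0, 210), (1, 180), (1, 120), (3, 240), (1, 200), (4, 190)]
print(gaze_durations(example))
```

In this made-up sequence, the later return to word 1 does not add to its gaze duration, and the skipped word 2 receives no gaze duration at all. Comparing such values, and the number of skipped words, between task-relevant and task-irrelevant sentences is the kind of contrast the study above reports.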

Moreover, it was found that task relevance influences the magnitude of the parafoveal preview effect, suggesting that readers’ attentional span is zoomed in on the currently fixated word when reading relevant information (Kaakinen & Hyönä, 2014). This result suggests that the attentional requirements of encoding relevant information to memory restrict the amount of information the reader can process during one fixation.

However, there are individual differences in how well readers can adjust their eye movements to meet the task demands. In a seminal study, Laycock (1955) investigated individual differences in the ability to “read as fast as possible without missing important points”. The results showed that while all readers were able to speed up reading without showing detrimental effects on comprehension, a group of more flexible readers differed from less flexible readers particularly in the rate and number of fixations during speeded reading: they were faster and made fewer fixations. Laycock concluded that some readers are better able to control their eye movements and possibly to increase their attentional span according to the task demands than others.

Previous research suggests that individual differences in working memory capacity (WMC) also play a role in how well readers adjust their text intake to the task at hand (Kaakinen, Hyönä, & Keenan, 2002, 2003). It seems that high-WMC readers are more effective than low-WMC readers in guiding their attention selectively to task-relevant text information and away from task-irrelevant information. In these studies, for high-WMC readers the relevance effect emerged already during the first-pass reading of sentences, while for the low-WMC readers the effect was only observed later in the look-back fixation time (i.e., the time spent rereading the target sentences after first fixating on a subsequent sentence).
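The distinction between first-pass and look-back fixation time, and the relevance effect computed on each, can be sketched as follows. This is an illustrative simplification under stated assumptions: fixations are given as hypothetical (sentence index, duration in ms) pairs, the names `sentence_times` and `relevance_effect` are invented here, and the relevance effect is taken as a plain difference of means rather than the statistical analyses used in the cited studies.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Set, Tuple

# (sentence_index, duration_ms): a hypothetical, simplified record format.
Fixation = Tuple[int, int]


def sentence_times(
    fixations: List[Fixation],
) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Split fixation time per sentence into first-pass and look-back time.

    A fixation counts as look-back time if a subsequent sentence has
    already been fixated; otherwise it counts as first-pass time.
    """
    first_pass: Dict[int, int] = defaultdict(int)
    look_back: Dict[int, int] = defaultdict(int)
    furthest = -1  # largest sentence index fixated so far
    for sentence, duration in fixations:
        if sentence >= furthest:
            first_pass[sentence] += duration
            furthest = sentence
        else:
            look_back[sentence] += duration
    return dict(first_pass), dict(look_back)


def relevance_effect(
    times: Dict[int, int], relevant: Set[int], irrelevant: Set[int]
) -> float:
    """Mean time on relevant minus mean time on irrelevant sentences."""
    rel = mean(times.get(s, 0) for s in relevant)
    irr = mean(times.get(s, 0) for s in irrelevant)
    return rel - irr


# Made-up data: sentence 0 is task-relevant, sentences 1 and 2 are not.
example = [(0, 200), (0, 250), (1, 220), (0, 180), (1, 210), (2, 230), (1, 150)]
first_pass, look_back = sentence_times(example)
print(relevance_effect(first_pass, {0}, {1, 2}))
print(relevance_effect(look_back, {0}, {1, 2}))
```

One consequence of this simplified rule is that a sentence skipped at first and fixated only after a later sentence contributes look-back time only. In these terms, the pattern described above corresponds to a positive first-pass effect for high-WMC readers and an effect confined to look-back time for low-WMC readers.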

Moreover, WMC is related to the ease with which task-relevant information is encoded to memory when readers have relevant prior knowledge about the text contents. Kaakinen et al. (2003) asked participants to read a text describing familiar diseases and another text describing unfamiliar diseases with the instructions to pay special attention to one of the diseases described in the text. In addition to tracking the readers’ eye movements during reading, a free recall was collected after reading. The results showed that when reading the text describing diseases readers had ample prior knowledge of, high-WMC readers did not show longer eye fixation time on relevant than on irrelevant text segments, even though they showed better recall of relevant than irrelevant text information. Low-WMC readers, on the other hand, demonstrated a relevance effect in the eye fixation times as well as in the text recall. This pattern of results suggests that high-WMC readers can efficiently encode task-relevant information to memory when the encoded information is familiar to them.

7.2 Developmental Trends in Text Reading

Relatively little is known about developmental trends in eye movements during reading of longer passages of text. However, the existing studies show that eye tracking has great potential for revealing developmental trends in reading comprehension skill.

The study by van der Schoot, Vasbinder, Horsley and van Lieshout (2008) suggests that 10–12-year-old children are sensitive to the relevance of text information. In the study, participants were asked to read a story either from a science or gossip journalist’s perspective. The stories used in the experiment contained both science-related information and descriptions of the social relationships between story characters. When told to pretend to be science journalists, children showed longer eye fixation times on science-related than on gossip-related words in the story, whereas when reading from the gossip journalist’s perspective, children showed longer eye fixation times on gossip-relevant information. The relevance effect was observed already in the first-pass fixation times on words. Moreover, children’s comprehension ability (as measured by an independent test) was positively correlated with increased time spent on task-relevant text information. After controlling for the effects of word decoding skill and vocabulary, the time spent regressing back to the task-relevant words was positively correlated with comprehension ability.

In the study of Kaakinen, Lehtola, and Paattilammi (2015), groups of 2nd (8–9 years), 4th (10–11 years) and 6th graders (12–13 years) read age-appropriate science textbook materials either in order to answer a “why” question presented before reading, or for general comprehension. The results showed that already 2nd graders adjusted their text scanning patterns to the task at hand by showing longer first-pass fixation times when preparing to answer a why-question than when reading for comprehension. In older age groups the effect of the reading task was seen as increased look-back times within the text, i.e., readers made more look-backs when reading to answer why-questions than when reading for comprehension. It is surprising to find that already 2nd graders adjusted their reading behavior in response to task instructions. Perhaps this is made possible by Finnish readers being relatively skillful word decoders already at a young age due to Finnish having completely regular letter-sound correspondence rules (Seymour et al., 2003). It would be interesting to study these effects among children reading less regular orthographies.

What seems to differentiate good from poor developing comprehenders is the ability to strategically look back to task-relevant information. Van den Broek, White, Kendeou, and Carlson (2009) used eye-tracking to examine reading strategies of successful and struggling young readers (4th, 7th and 9th graders). They found that the groups differed with respect to where in text look-backs were directed: readers with good comprehension skills reread specific, informative parts of text, whereas struggling readers reread text unselectively. A study by van der Schoot, Reijntjes, and van Lieshout (2012) suggests that young readers (10–12-year-olds) with good comprehension skill are more likely than poor comprehenders to look back to inconsistencies in text. In their study, participants read stories describing a character (e.g., “Mary is a vegetarian” or “Mary is a fast-food addict”). The story character’s actions (e.g., “Mary ordered cheeseburger and fries”) described in the passage were either consistent or inconsistent with the character description, and the distance between the action and the character description was manipulated. Both good and poor comprehenders were sensitive to inconsistencies when the inconsistent character action directly followed the character description. However, when the description and action were separated by several sentences, only readers with good comprehension skills tended to make regressions to the contradictory information. This reflects their more integrated representation of the entire text.

8 Outlook

As has become apparent from the present chapter, the application of the eye-tracking technique to the study of reading has been a success story. We also see a bright future ahead of us. We see five research avenues that are likely to gain momentum in the future. First, developmental eye movement studies are likely to significantly increase our understanding of the acquisition and development of reading skills (see Schröder et al., 2015). Second, a lot can be learned from cross-linguistic studies where reading processes are directly compared across different languages and orthographies (see Liversedge et al., 2015). Third, eye-tracking has not been fully exploited in investigating text comprehension processes. Thus, we expect more eye movement studies to appear on reading longer texts. Fourth, eye-tracking studies on reading texts presented via electronic media (Hyönä, 2010) are highly likely to gain popularity simply because a lot of reading is nowadays done via electronically available texts. Moreover, linear reading of printed books is increasingly complemented by non-linear reading, for example, of hypertexts. This is likely to change some key aspects of reading related to higher-order comprehension processes. Finally, the combination of eye-tracking data with other data reflecting moment-to-moment processing (e.g., EEG, fMRI, motion capture, psychophysiological measures) will further advance our understanding of the reading process. All in all, we feel confident that a lot can still be learned from applying the eye-tracking method to study reading as it takes place in different orthographies and reading environments.

9 Suggested Readings

  • Rayner, K., Pollatsek, A., Ashby, J., & Clifton, C. Jr. (2012). Psychology of reading (second edition). Hove, UK: Psychology Press.

  • This monograph provides a comprehensive coverage of the cognitive processes involved in reading, including chapters on eye movements in reading.

  • Liversedge, S.P., Gilchrist, I.D., & Everling, S. (2011). The Oxford handbook of eye movements. Oxford, UK: Oxford University Press.

  • This edited volume contains two sections relevant to eye movements in reading (Part 6: Eye movement control during reading; Part 7: Language processing and eye movements). Relevant chapters of this volume are referred to in the text.

  • Reichle, E. D. (in press). Computational models of reading: A handbook. Oxford, UK: Oxford University Press.

  • This monograph provides a comprehensive coverage of all major theories of reading.

  • Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.

  • This review article provides a comprehensive summary of the eye movement research done in reading up to late 1990s.

  • Rayner, K., Ardoin, S. P., & Binder, K. S. (2013). Children’s eye movements in reading: A commentary. School Psychology Review, 42, 223–233.

  • This is an introduction to a special issue on children’s eye movements during reading.

  • Schröder, S., Hyönä, J., & Liversedge, S.P. (2015). Developmental eye-tracking research in reading: Introduction to the Special Issue. Journal of Cognitive Psychology, 27, 500–510.

  • This Special Issue contains original research articles dealing with developmental aspects of eye movements in reading. The emphasis is on young, developing readers.

  • Blythe, H. I. (2014). Developmental changes in eye movements and visual information encoding associated with learning to read. Current Directions in Psychological Science, 23, 201–207.

  • This article reviews the literature on developmental changes in eye movements during reading.

  • Staub, A., & Rayner, K. (2007). Eye movements and on-line comprehension processes. In G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 327–342). Oxford, UK: Oxford University Press.

  • This chapter summarizes eye movement studies conducted on syntactic parsing during reading.

  • Hyönä, J., Lorch, R. F. Jr., & Rinck, M. (2003). Eye movement measures to study global text processing. In J. Hyönä, R. Radach & H. Deubel (Eds.), The mind’s eye: Cognitive and applied aspects of eye movement research (pp. 313–334). Amsterdam: Elsevier.

  • The chapter introduces eye movement measures to study the processing of long texts presented as multiple-line screens.

10 Questions Students Should Be Able to Answer

  1. Why is eye-tracking a useful tool to study reading? Mention at least three reasons.

  2. Is speedreading feasible based on what you have read in this chapter? Why do you think that way?

  3. How do eye movement patterns differ during reading of single sentences and longer text paragraphs? Why?

  4. What have eye movement studies revealed about the development of reading skill?