Introduction

Imagine a classroom situation: some students seem to be listening attentively, whereas others appear to stare blankly into space or doodle away in their notebooks. This scenario illustrates the spectrum of mental states that students could experience during a learning situation, ranging from being strongly engaged with the lecture content to focusing on other things entirely, or not thinking at all. The shift in attention away from a task to self-generated, task-unrelated thoughts is also known as mind wandering and has been shown to influence learning in a mostly negative way (e.g., Randall et al., 2014). For instance, mind wandering during virtual lectures (Faber et al., 2020; Hutt et al., 2017) and face-to-face lectures (Wammes et al., 2016) is related to worse performance on a quiz. These adverse effects highlight the importance of establishing when, why, and how mind wandering arises to be able to potentially alleviate its negative effects on learning.

However, mind wandering is notoriously difficult to measure. There are several reasons for this. First, the measurement of mind wandering often relies on self-reporting, which is inherently subjective and prone to error, due to biases pertaining to the demand characteristics of an experiment or evaluation apprehension (e.g., the Hawthorne effect; Smallwood & Schooler, 2015). In addition, the act of self-reporting might have limited ecological validity and disrupt the natural flow of a task or process. Despite these downsides to self-reports, they do provide valuable insights into variations across tasks and people and open up the possibility to identify behavioral and neural correlates of mind wandering that could potentially be used as a more objective measure.

Second, it is unlikely that there is one set of behaviors and/or neural signatures of mind wandering that generalizes across all tasks and situations. For instance, zoning out during driving might manifest as a “tunnel vision” on the road ahead (He et al., 2011), whereas mind wandering during a boring vigilance task might cause a person to look away from a central point of fixation (Faber et al., 2020). The neurophysiological correlates of mind wandering might therefore depend on task affordances, such as what “normal,” on-task behavior looks like and what kind of processing the task requires. These idiosyncrasies are important to take into consideration in the development of measures for identifying mind wandering in the classroom, which is a highly heterogeneous context.

In this chapter, we will provide an overview of subjective and objective measures of mind wandering, their applications, and their current limitations, and we will discuss implications for measuring mind wandering in educational contexts. We will argue that by triangulating subjective self-reports with indirect behavioral and neurophysiological measures, it is possible to arrive at more comprehensive measures of mind wandering.

Measuring Mind Wandering

Subjective Measures

Perhaps the most straightforward method for measuring mind wandering is to directly ask people about the content and unfolding of their thoughts. This can be accomplished through questionnaires, online self-reports, and offline self-reports. Questionnaires include measures that tap into mind wandering either as a trait (e.g., overall self-generated thought tendencies) or as a state (e.g., how much a person thinks they mind wandered in a specific situation). Trait-level measures, such as the Imaginal Process Inventory (IPI; Singer & Antrobus, 1972), the Mind Wandering-Deliberate and Mind Wandering-Spontaneous questionnaire (Seli et al., 2016), or the Mind Wandering Inventory (Gonçalves et al., 2020), capture stable self-generated thought tendencies. However, trait-level measures are not always reliable predictors of task-related behavior: previous work has shown a discrepancy between trait-level mind wandering proneness scores on the one hand and self-reported online mind wandering measures and eye gaze-based measures during reading on the other (Faber et al., 2018a). It is possible that this lack of convergence is due to inaccurate self-appraisal or other biases. However, it could also reflect a meaningful distinction, such as a discrepancy between the experience of mind wandering during everyday life and during a cognitively demanding task, or a distinction between being able to report a gist-level measure of mind wandering versus having awareness of and/or access to individual mind wandering thoughts (Dias da Silva et al., 2020).

There are several measures that aim to tap into state-level processes. Retrospective questionnaires typically are designed to have participants characterize the average content (Seibert & Ellis, 1991) or frequency (Matthews et al., 1999; Smallwood et al., 2004) of thought during a preceding period but are prone to memory-related errors or omissions (Ellamil et al., 2016). Online reports involve intermittently asking individuals about the contents of their thought in real time. These questions, referred to as probes, are used to track the contents of thought during resting state, during performance of an experiment, or in everyday life using smartphone applications, for example. An alternative to probing participants during a task is to ask individuals to report whenever they catch their minds wandering (Smallwood & Schooler, 2015). However, most individuals’ ability to catch their mind in flight and to report on the mental processes and dynamics that give rise to thought content is generally considered to be poor (Ellamil et al., 2016). There are some individuals, however, such as experienced meditators, who have high levels of meta-awareness. These individuals are capable of catching their mind wandering episodes in flight with high temporal precision. Therefore, an alternative experience sampling approach to tapping into the dynamics of mind wandering is to collect self-reports from these individuals while they undergo a brain scan (e.g., fMRI), a method called neurophenomenology (Ellamil et al., 2016).

Online reports appear to be the best method to date. Moreover, they are less prone to memory and self-serving biases, which could influence both retrospective and trait-questionnaire reports. In addition, they can yield the richest data, as they enable a large number of distinctive thought reports that can reveal corresponding distinctive neural, physiological, and behavioral correlates. Nevertheless, experience sampling approaches alone cannot capture moment-to-moment fluctuations between states of mind wandering and focused attention. Therefore, it is important to triangulate different direct measures of mind wandering with indirect behavioral measures such as accuracy and neurophysiological measures across tasks (Smallwood & Schooler, 2015).

Objective Measures

As outlined above, the phenomenon of mind wandering has been studied extensively over the last decade. However, most studies suffer from inherent limitations imposed by their reliance on subjective measures of mind wandering.

These subjective self-reports are critically dependent on meta-awareness, which is the explicit awareness of the content of thought (Schooler et al., 2011; Smallwood & Schooler, 2006). To alleviate these issues, attempts have been made to measure mind wandering by triangulating self-reports, behavioral measures, and neurocognitive measures (e.g., Faber et al., 2018a; Mittner et al., 2014; Smallwood & Schooler, 2015).

Eye Tracking Findings

One particularly promising avenue is the use of eye movement to detect mind wandering: eye tracking is cheap, noninvasive, minimally intrusive, and scalable to naturalistic settings such as classrooms (Bixler & D’Mello, 2016). Eye movement recordings tap into what is known as the “eye-mind link” (Just & Carpenter, 1976), that is, that gaze reflects the deployment of cognitive resources to the external world. Accordingly, as reviewed below, studies have found that a number of gaze parameters are linked to mind wandering and are broadly thought to reflect the decoupling of attention from processing external stimuli that occur during mind wandering (Smallwood & Schooler, 2006). The first notion of studying eye movements to understand mind wandering stems from the 1960s when researchers found that eye movements and blinks were more frequent when participants were actively engaged in thinking or suppressing a daydream than when they were mind wandering (Antrobus et al., 1964). Subsequent studies have attempted to identify the eye movement correlates of mind wandering—mostly in the domain of reading—but findings have been mixed. Below, we will first discuss what normal reading behavior looks like, followed by a discussion of studies that have looked into how gaze behavior changes during mind wandering during reading and other tasks.

During attentive reading, eye movements typically follow a regular pattern (i.e., from word to word), and fixation durations (the periods during which gaze remains relatively still and new information is acquired) vary as a function of the length, frequency, and processing difficulty of the words in the text (Foulsham et al., 2013; Reichle et al., 2010). That is, more difficult words are associated with longer fixation durations, a pattern that is thought to reflect greater lexical and linguistic processing for those words. Moreover, roughly 10–15 percent of saccades (the periods during which the eyes are in motion) are regressions, that is, backward eye movements to previous words (Rayner et al., 2006), and regressions become more frequent during difficult parts of the text where comprehension requires greater processing. Left-to-right saccades also typically become shorter, as measured by the angular distance of the saccade (saccade amplitude), within more difficult texts, so that each word is carefully fixated, processed, and understood (Rayner, 1998). Considered collectively, the gaze patterns observed during normal reading are thought to systematically reflect the real-time lexical and linguistic processing demands of the given text.
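The gaze parameters described above can be made concrete with a small sketch. The Python fragment below is purely illustrative: the `Fixation` record, its field names, and the sample values are assumptions made for this example and are not taken from any of the cited studies. It computes the number of fixations, the mean fixation duration, the mean saccade amplitude (approximated here as the distance between successive fixations in screen units rather than visual angle), and the proportion of regressions, under the simplifying assumption that the text consists of a single line read from left to right.

```python
from dataclasses import dataclass
from math import hypot
from statistics import mean


@dataclass
class Fixation:
    x: float            # horizontal gaze position (hypothetical screen units)
    y: float            # vertical gaze position
    duration_ms: float  # time the gaze stayed relatively still


def gaze_features(fixations):
    """Summarise a fixation sequence recorded on a single line of text."""
    if len(fixations) < 2:
        raise ValueError("need at least two fixations to define a saccade")
    # Each saccade is the movement between two consecutive fixations.
    pairs = list(zip(fixations, fixations[1:]))
    amplitudes = [hypot(b.x - a.x, b.y - a.y) for a, b in pairs]
    # Simplification: a regression is any saccade that moves leftward.
    regressions = sum(1 for a, b in pairs if b.x < a.x)
    return {
        "n_fixations": len(fixations),
        "mean_fixation_ms": mean(f.duration_ms for f in fixations),
        "mean_saccade_amplitude": mean(amplitudes),
        "regression_proportion": regressions / len(pairs),
    }


# Made-up fixation sequence for illustration only.
sample = [
    Fixation(100, 50, 210),
    Fixation(180, 50, 250),
    Fixation(140, 50, 300),  # leftward jump: counted as a regression
    Fixation(260, 50, 190),
    Fixation(340, 50, 220),
]
features = gaze_features(sample)
print(features)
```

In multi-line text, a return sweep to the start of the next line would also move leftward, so a realistic implementation would need line information to separate return sweeps from regressions.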

Studies investigating mind wandering during reading (i.e., mindless reading), however, have identified deviations in gaze patterns from focused reading, which, when considered collectively, suggest a decoupling between gaze and text features (D’Mello et al., 2013; Faber et al., 2018a; Loboda, 2014; Schad et al., 2012). As one illustrative example, Reichle et al. (2010) recorded eye movements as participants read an entire novel over the course of several days. Periodically, participants self-reported whether they were attentively reading or mind wandering in a given moment. Results showed that self-reported mind wandering was associated with longer fixation durations, with observable differences up to 120 s prior to the self-report. Furthermore, the variability in fixation durations associated with mindless reading was unrelated to word length or frequency, unlike fixation durations during normal reading. This finding in particular suggests that the link between eye movements and linguistic characteristics (e.g., longer fixation on longer, low-frequency words) breaks down during mindless reading.

Subsequent studies have focused on establishing the relationship between mindless reading and fixation parameters, such as the number of fixations, their duration, and dispersion (Bixler et al., 2015; Bixler & D’Mello, 2014; Faber et al., 2018a; Frank et al., 2015; Uzzaman & Joordens, 2011), but results are mixed in terms of direction and significance of the observed effects (e.g., Foulsham et al., 2013; Steindorf & Rummel, 2020). Moreover, there are also inconsistent findings regarding other gaze parameters, such as saccades. For instance, mindless reading has been associated with changes in left-to-right reading behavior, with some studies showing smaller (Bixler & D’Mello, 2014), longer (Bixler et al., 2015), and fewer saccades (Faber et al., 2018a) during mindless reading and others showing the opposite effects (Foulsham et al., 2013). Some studies indicate that mindless reading is also associated with fewer regressions (Foulsham et al., 2013; Reichle et al., 2010; Uzzaman & Joordens, 2011), but others have shown that this effect interacts with age (Frank et al., 2015). The collective conclusion from these investigations suggests that eye movements do change during mindless reading—which likely reflects a change in the cognitive processes that support comprehensive reading, such as lexical and linguistic processing—but the direct relationship between eye movements and mind wandering during reading remains underspecified. Moreover, it remains unclear what gaze behaviors are reflective of mind wandering more broadly—such as contexts with limited visual information (e.g., listening to an audio book) or stimuli that strongly direct visual attention (e.g., watching narrative films; Loschky et al., 2015)—which would provide further insight into how visual and cognitive processes operate under varying states of attention.

Indeed, relatively few studies have explored the gaze correlates of mind wandering in contexts other than reading, and those that did report heterogeneous associations. To illustrate, during narrative film comprehension, mind wandering is accompanied by a decrease in smooth pursuit of salient objects (Mills et al., 2016). When watching a lecture, on the other hand, gaze parameters that capitalize on these local relationships (e.g., pursuit of salient objects) did not contribute much to the identification of mind wandering over and above the characteristics of fixations and saccades (Hutt et al., 2017). However, recent work has shown that viewers fixate more on the lecturer when mind wandering, and fixations on the lecture slides become longer and less dispersed (Zhang et al., 2020). For learners interacting with an intelligent tutoring system, mind wandering could be predicted from context-independent (global) gaze parameters such as fewer fixations and saccades, more dispersed fixations, and longer and slower saccades (Hutt et al., 2016, 2017). Likewise, when exploring a visual scene for a later memory task, fewer, longer, and more dispersed fixations were associated with mind wandering (Krasich et al., 2018). In the context of driving, however, He et al. (2011) found that participants made fewer horizontal saccades when mind wandering, which suggests smaller fixation dispersion and a reduced propensity to broadly scan the road. Although there is some consistency across these findings, collectively they suggest that task affordances might determine which gaze parameters are predictive of mind wandering in each context.

To address this issue, recent work has systematically investigated the gaze correlates of mind wandering across tasks (Faber et al., 2020). Specifically, seven brief tasks were used that vary in terms of spatial allocation demands, visual processing demands, and discourse processing demands. The tasks consisted of a sustained attention to response task (SART), listening to an audio book, reading a narrative story, studying a visual scene, studying an illustrated text, watching a recorded lecture, and watching a narrative film. Mind wandering during tasks that require extensive sampling of the visual field, such as reading, studying a scene, and studying a diagram, was associated with a decrease in fixations and, in some cases, with longer or more dispersed fixations. Taken together, these findings suggest that visual sampling becomes sparser across the board, although the specific gaze correlates might vary slightly across tasks. This sparsity supports the idea that self-generated thoughts are prioritized over the processing of external information during mind wandering, suggesting that a decrease in eye movements represents a global dampening in visual information processing. As discussed below, this account is supported by previous findings from neuroimaging research (e.g., Baird et al., 2014; Barron et al., 2011; Kam et al., 2011; Smallwood et al., 2008c).

In contrast, for tasks in which participants normally focus on a central fixation point, such as a SART, listening to an audio book, and watching a lecture, mind wandering was associated with shorter fixations, more dispersed fixations, and larger saccades, suggesting more exploratory eye movement behavior. However, these relationships were found to be less generalizable, suggesting that eye movement behaviors might not be robust behavioral signatures of mind wandering in these contexts. In addition, these fixation parameters were found to not be predictive of mind wandering during narrative film watching. The processing demands of narrative films differ from those of other stimulus contexts in that narrative films are heavily edited to guide attention (see Zacks, 2015) and, therefore, gaze (Loschky et al., 2015) and mind wandering (Faber et al., 2018b), such that mind wandering is less likely to occur during periods in which there are more changes in the depicted events (e.g., change in scene, shift in time). Eye movements might be more strongly predicted by whether the eyes follow the salient characters and/or objects rather than by a global dampening in visual processing (Mills et al., 2016).

The idea that attentional decoupling during mind wandering might increase the likelihood that the eyes also “wander away” has previously been phrased in terms of an exploration-exploitation tradeoff (Jepma & Nieuwenhuis, 2011). Previous work has shown that mind wandering during a stop-signal paradigm (which is similar to a SART in terms of visual presentation) is related to an increase in exploratory behavior (Mittner et al., 2014). This behavior is thought to be modulated by the locus coeruleus-norepinephrine (LC-NE) system and has previously been linked to changes in pupil diameter (Jepma & Nieuwenhuis, 2011) that are thought to index cognitive load (Granholm et al., 1996). A number of studies have shown (often conflicting) associations between mind wandering and pupil size and response using tasks with a central fixation point (Franklin et al., 2013; Grandchamp et al., 2014; Konishi et al., 2017; Mittner et al., 2014; Unsworth & Robison, 2016). However, pupillometry is not necessarily suitable for all tasks: for tasks that require extensive sampling of the visual field, each fixation would be accompanied by a difference in luminance and other low-level visual properties that can impact pupil diameter independently from the cognitive state of the observer. Moreover, in free viewing tasks, measurements of pupil diameter can be confounded due to changes in eye orientation when looking at the edges of the screen (Hayes & Petrov, 2016). Still, the (albeit not entirely understood) relationship between mind wandering and pupil diameter supports the idea that mind wandering is associated with an exploration-exploitation tradeoff in tasks that afford fixations focused on a small area of the visual field.

Taken together, the findings reviewed above suggest that there are patterns of eye movement deviations that are predictive of mind wandering, but it is likely that idiosyncrasies across tasks hinder the identification of one set of eye movement behaviors that generalize across all potential situations.

EEG Findings

Studies using electroencephalography (EEG) are becoming increasingly popular in the study of mind wandering across a variety of fields, ranging from psychology to brain-computer interface research. In comparison to eye tracking, EEG measurements are more expensive, more intrusive, and less scalable to naturalistic settings such as classrooms (D’Mello et al., 2016). EEG is useful for measuring brain activity time-locked to stimuli under controlled situations but is difficult to interpret in more complex tasks that rely on naturalistic variation, such as reading a book or watching a movie, as the design of an EEG experiment critically relies on a comparison between conditions and/or against a baseline. In addition, EEG has a poor spatial resolution, as it is only capable of measuring electrical activity at the surface of the cortex, making it difficult to localize signals which originate deeper in the brain (Sturzbecher & de Araujo, 2012). However, EEG has very high temporal precision: it can record from 250 to over 2000 samples of electrical brain activity per second. As such, it provides valuable insight into the evolution of cognitive processes during mind wandering across time. Moreover, when triangulated with findings from other modalities, it helps to paint a fuller picture of the dynamics associated with mind wandering. Results from these studies contribute to increasing our understanding of the cognitive processes underlying mind wandering states. Similar to the pattern found in eye tracking studies, findings do not always converge across EEG studies. In what follows, we give an overview of brain signatures typical of focused attention toward a task, followed by an overview of studies that have looked into how EEG measures change during mind wandering. We first discuss findings related to event-related potentials, and we subsequently focus on oscillatory findings. We then attempt to reconcile seemingly disparate findings by proposing that there is no “one-size-fits-all” neural signature of mind wandering but that, instead, this signature varies according to the type of task being performed as well as individual differences.

Neural activity has often been investigated during sustained attention tasks, such as variations of the oddball task, in which participants are asked to respond to rare stimuli, or of the sustained attention to response task (SART), in which they withhold responses to rare stimuli. These are generally monotonous tasks that require participants to respond to stimuli over extended periods of time. Focused, on-task behavior during these tasks is accompanied by early event-related potentials (ERPs) elicited by early attention control mechanisms in occipital regions of the brain (Hillyard et al., 1998). The P100, which occurs within approximately 100 milliseconds of stimulus onset, is evoked in response to visual stimuli. The N100, which occurs in this same time frame, is elicited by auditory stimuli. These early components are thought to be related to alerting attentional mechanisms (Hillyard et al., 1998). In addition to early sensory responses, focused behavior is also associated with a later component, namely, the P300 ERP, in parietal and occipital regions of the brain (Polich, 2007). This response is presumed to be driven by the activation of orienting networks and working memory updating and is indicative of cortical processing of stimuli or events (Dehaene & Changeux, 2011; Mashour et al., 2020).

Moreover, patterns of oscillatory activities have also been investigated in relation to focused attention. Oscillatory activity in the beta (13–30 Hz) frequency band has been associated with task-related, visual attention (Gola et al., 2013; Laufs et al., 2006). Activity in the gamma (30–100 Hz) frequency band has been associated with executive attention, working memory, and long-term memory activation (Jensen et al., 2007). Theta (4–7 Hz) and alpha (8–13 Hz) frequencies have been associated with top-down processes and working memory (Baird et al., 2014; Sauseng et al., 2005). In addition, a reduction in alpha band power has been commonly observed when attention is oriented toward an external visual task (Klimesch, 2012; Mann et al., 1996; Pfurtscheller et al., 1996).
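To illustrate what power in a frequency band means in practice, the following Python sketch estimates theta, alpha, and beta power from a synthetic signal using a plain discrete Fourier transform. The sampling rate, the signal, and the function name `band_power` are assumptions made for this example. Published analyses typically rely on more robust spectral estimators (e.g., Welch's method) and on artifact rejection, both of which are omitted here.

```python
import cmath
import math

# Band limits follow the text (theta 4-7 Hz, alpha 8-13 Hz, beta 13-30 Hz).
# Upper edges are exclusive so that the shared 13 Hz boundary is counted once;
# theta is therefore written as [4, 8).
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}


def band_power(signal, sampling_rate, low_hz, high_hz):
    """Sum normalised squared DFT magnitudes for frequencies in [low_hz, high_hz)."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * sampling_rate / n
        if low_hz <= freq < high_hz:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            power += abs(coeff) ** 2 / n ** 2
    return power


# Synthetic one-second "recording": a strong 10 Hz rhythm plus a weak 20 Hz one.
SAMPLING_RATE = 250  # Hz; hypothetical value
signal = [
    2.0 * math.sin(2 * math.pi * 10 * t / SAMPLING_RATE)
    + 0.5 * math.sin(2 * math.pi * 20 * t / SAMPLING_RATE)
    for t in range(SAMPLING_RATE)
]

powers = {
    name: band_power(signal, SAMPLING_RATE, low, high)
    for name, (low, high) in BANDS.items()
}
print(powers)
```

Because the synthetic signal is dominated by a 10 Hz component, nearly all of its power falls in the alpha band, with a smaller share in beta and essentially none in theta.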

Studies investigating mind wandering during sustained attention tasks report changes in both ERP responses and in oscillatory patterns characteristic of focused attention. With regard to ERP responses, there appears to be an attenuation in both early and later components during mind wandering. Several studies consistently describe an attenuation of the P300, indicating a decoupling of top-down attentional processes (e.g., Barron et al., 2011; Kam et al., 2011; Smallwood et al., 2008a). Other studies also report an attenuation in the P100 and N100 components (Baird et al., 2014; Kam et al., 2011), indicative of sensory-motor decoupling in the visual and auditory domains, respectively. The fact that some studies have found differences in early sensory components and others have not can be explained by differences in the types of tasks being performed. Studies that fail to find changes in sensory ERPs tend to present visual stimuli at fixation (i.e., standard versions of the SART). However, studies in which responses to parafoveal stimuli have been measured report attenuations in both the P100 and P300 (Kam et al., 2011) during mind wandering. Moreover, an attenuation in the N100 component has been found in tasks requiring participants to respond to auditory stimuli (Braboszcz & Delorme, 2011; Kam et al., 2011).
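The comparison underlying such ERP findings can be sketched as follows: stimulus-locked epochs are averaged separately for trials preceding on-task and mind wandering reports, and the mean amplitude in a latency window is compared. Everything in this Python fragment is a toy assumption, including the sampling rate, the 300–400 ms window, the plateau-shaped "component," and the amplitudes. It illustrates the averaging logic only and does not reproduce any cited analysis.

```python
from statistics import mean

SAMPLING_RATE = 250  # Hz; hypothetical value
EPOCH_MS = 600       # epoch length after stimulus onset; hypothetical


def make_epoch(peak_uv):
    """Toy epoch: flat at 0 except a plateau of `peak_uv` from 300 to 400 ms."""
    n_samples = EPOCH_MS * SAMPLING_RATE // 1000
    start = 300 * SAMPLING_RATE // 1000
    end = 400 * SAMPLING_RATE // 1000
    return [peak_uv if start <= i < end else 0.0 for i in range(n_samples)]


def average_epochs(epochs):
    """Average equally long, stimulus-locked epochs sample by sample."""
    return [mean(samples) for samples in zip(*epochs)]


def window_mean(erp, start_ms, end_ms):
    """Mean amplitude of an averaged waveform between two latencies."""
    start = start_ms * SAMPLING_RATE // 1000
    end = end_ms * SAMPLING_RATE // 1000
    return mean(erp[start:end])


# Two made-up trials per condition; real studies average many more.
on_task_erp = average_epochs([make_epoch(5.0), make_epoch(7.0)])
mind_wandering_erp = average_epochs([make_epoch(2.0), make_epoch(4.0)])

on_task_p300 = window_mean(on_task_erp, 300, 400)
mind_wandering_p300 = window_mean(mind_wandering_erp, 300, 400)
print(on_task_p300, mind_wandering_p300)
```

In this toy example, the smaller windowed amplitude in the mind wandering condition corresponds to what the text describes as an attenuation of the component.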

With regard to oscillatory activity, variations in alpha rhythm play an important part in both perception and attention (Klimesch, 2012). Increases in alpha have been associated with internal processing (Benedek et al., 2014), supporting the notion of a decoupling from the environment during mind wandering. In a rapid serial visual presentation (RSVP) task, pre-stimulus alpha was found to increase over parieto-occipital sites (Macdonald et al., 2011). Similarly, Compton et al. (2019) found increases in alpha power measured up to 10 seconds prior to reports of mind wandering over frontal, central, parietal, and occipital scalp areas during a Stroop task, with higher alpha toward posterior sections of the scalp. Under a more ecological experiment consisting of a driving simulation, Baldwin et al. (2017) also found alpha to increase in posterior scalp areas.

In contrast, Baird et al. (2014) found a reduction in event-related alpha (9–11 Hz) and beta (15–30 Hz) spectral power over frontal, central, and parietal scalp regions after stimulus onset during an undemanding vigilance task. Moreover, they found a decrease in theta band (4–7 Hz) cortical phase-locking over parietal regions of the brain. Lutz et al. (2008) propose that an increase in phase-locking is related to a reduced tendency to engage in task-unrelated thoughts (see Cahn et al., 2013).

During a breath counting task with a passive auditory protocol (which participants performed with their eyes closed), Braboszcz and Delorme (2011) found decreases in alpha and beta activity in occipital and fronto-lateral areas, respectively, prior to self-caught episodes of mind wandering. Moreover, they found increases in theta band oscillations over all scalp regions to be associated with mind wandering, which were particularly more pronounced over occipital and parietal areas. Similarly, van Son et al. (2019) found a greater theta-beta ratio in frontal scalp areas during mind wandering. Increased theta oscillations are typically associated with decreases in sustained task-related attention and during transitional stages from wakefulness to sleep (Braboszcz & Delorme, 2011; Klimesch, 1999), while a higher theta-beta ratio has been often related to lower attentional control (van Son et al., 2019).

While some studies indicate that increases in alpha (Baldwin et al., 2017; Benedek et al., 2014; Compton et al., 2019; Macdonald et al., 2011)—particularly over parietal and occipital scalp areas—are a distinct neural signature of mind wandering, others propose increases in theta and decreases in alpha instead (Baird et al., 2014; Braboszcz & Delorme, 2011). What we notice is that alpha seems to increase during mind wandering in tasks requiring sustained visual attention to the external environment. As such, it seems to be more indicative of a visual sensory-motor decoupling during mind wandering. During tasks which do not require visual attention, but attention to auditory stimuli (or internal states) instead, alpha increase seems to be actually related to increased processing (Cartocci et al., 2018; Wisniewski et al., 2017), while decreased alpha has been shown to be related to low levels of vigilance (Braboszcz & Delorme, 2011). Meanwhile theta seems to increase prior to mind wandering reports in situations which do not require attentiveness to external stimuli, that is, in tasks with low to no visual perceptual acuity. For example, in the breath counting task in Braboszcz and Delorme’s (2011) study, performance did not require any responses to external stimuli but rather attentiveness to internal states. As such, it is possible that theta increases reflected the ability to be aware and attentive to one’s own internal state, which was essential for catching the mind wandering.

Recent work has shown that theta band connectivity (i.e., the frequency-locked synchrony between two brain areas or networks) between the default mode network (DMN) and a subsystem of the frontoparietal control network that is associated with abstract thinking, emotional processing, episodic and prospective memory, and mental simulation of events (Dixon et al., 2018) increases when attention is directed inward (Kam et al., 2019). This observation is in line with a wealth of studies that have previously shown that DMN activity is linked to cognitive processes that require internally focused attention (vs external attention), including mind wandering. Simultaneous EEG and fMRI have shown associations between theta activity and the BOLD response in the DMN during resting state (i.e., a state without a task), suggesting that, indeed, theta activity might be an EEG marker of mind wandering (Scheeringa et al., 2008). Paralleling these findings, Kirschner et al. (2012) found increased connectivity also in the alpha, beta, and gamma frequency bands across regions of the DMN preceding mind wandering reports, suggesting convergence across different neuroimaging markers.

Evidence from resting state (i.e., task-free) and task fMRI has also revealed that individual variations in mind wandering propensity can be linked to variations in static DMN functional connectivity, whereas ongoing mind wandering episodes are reflected in time-varying DMN functional connectivity (Kucyi & Davis, 2014), suggesting that both trait and state levels of mind wandering can be measured using fMRI. This opens up the opportunity to study mind wandering and its unfolding during tasks that are difficult to study using EEG, such as reading or watching a film, due to the fact that the analysis of EEG data critically relies on a comparison between different conditions or against a baseline. Although the field of detecting mind wandering using fMRI is still in its infancy, recent work has shown that it is indeed possible to use multivariate pattern classification of fMRI data to distinguish between distinct experiential axes of mind wandering (Wang et al., 2018). It is likely that these methodological advances will enable more in-depth characterizations of the neural signatures of mind wandering. Despite the fact that fMRI might never be scalable to classroom situations, these insights are nevertheless important for understanding the cognitive processes that underlie the neurophysiological features that we can measure using other sensing technologies during learning. In particular, they can shed light on the question of whether the observed patterns of task-specific (or task cluster-specific) neurophysiological features (e.g., eye movements, motor movements, ERP signals) are associated with distinct up- and down-regulations of brain networks that vary across those tasks or clusters in a principled manner and whether there are differences and/or similarities between tasks that are not reflected in other sensing modalities.

Although the focus of this chapter is on neurophysiological features, there are also behavioral indices that are associated with mind wandering. Previous work has, for instance, used facial features, posture, response times, and mouse tracking to distinguish between on- and off-task states. Several facial features, such as lowering of brows, raising cheeks, wrinkling nose, tightening lips, dimples, and dropping of the jaw, have been associated with mind wandering across task contexts (e.g., reading an expository text and watching a narrative film; Stewart et al., 2017). In addition, posture changes during mind wandering such that the face drops and moves closer to the screen (Stewart et al., 2017). Mouse movements (Dias da Silva & Postma, 2020) and reaction times (Bastian & Sackur, 2013; Smallwood et al., 2008b) become slower and more variable, although results vary across tasks. A more detailed report of these bodily features and their relationship to mind wandering can be found in Dias da Silva et al. (2022) in this book.

In the context of education, a triangulation of neurophysiological and behavioral (or bodily) features with self-reports can be helpful for identifying indirect measures of mind wandering that are generalizable across different learning contexts. If successful, this could lead to the development of “attention-aware” learning tools, such as software that helps the learner get back on track when they go off-task (D’Mello et al., 2016). For these strategies to be successful, it is necessary to identify which constellations of features are most likely to signal mind wandering during a variety of learning activities. In the remainder of this chapter, we will discuss machine learning studies that have attempted to predict mind wandering from neurophysiological and bodily features in the context of learning. In particular, we focus on how successful these methods are at detecting mind wandering in the context of an intelligent tutoring system—a computerized system that encompasses several learning activities including, e.g., reading, exercises, lectures, and animations—in the lab and in the classroom.

Mind Wandering Detection

In the past decade, applied research on mind wandering in the context of intelligent tutoring systems has been greatly facilitated by advances in predictive modeling by means of machine learning. The main advantage of using data-driven techniques for mind wandering detection over traditional behavioral statistics is that machine learning models are evaluated on the basis of their fit to unseen data, which exposes overfitting and the resulting lack of generalizability. The general goal of these approaches is to use detectable behavioral and psychophysiological cues, such as upper body movement (Stewart et al., 2017) including head pose (Bosch & D’Mello, 2019), facial features (Stewart et al., 2017), gaze patterns (Bixler & D’Mello, 2016; Blanchard et al., 2014; Brishtel et al., 2020; Faber et al., 2018a; Hutt et al., 2016, 2017; Zhao et al., 2017), electrodermal features (Brishtel et al., 2020), EEG (Hosseini & Guo, 2019; Jin et al., 2019), and heart rate changes (Pham & Wang, 2015), to predict upcoming episodes of mind wandering during a learning task. The negative impact of zoning out from the task could then potentially be alleviated by a reactive intervention or an alert, e.g., a sound or a visual stimulus, possibly representing detected levels of attention in real time (Mills et al., 2020). The alerting mechanism would draw the learner’s attention back to the task at hand or would provide more engaging and relevant input. In addition, a time-referenced analysis of the mind wandering data can be used post hoc to improve the educational tool itself, as in the case of the AttentiveLearner (Pham & Wang, 2015).

The most frequently used algorithms for predictive modeling of mind wandering by means of supervised learning include different kinds of logistic regression (Bixler & D’Mello, 2014; Hutt et al., 2016; Pham & Wang, 2015; Stewart et al., 2017; Zhao et al., 2017), random forests (Bixler & D’Mello, 2014; Brishtel et al., 2020; Hutt et al., 2016; Stewart et al., 2017), and support vector machines (Bixler & D’Mello, 2014; Bosch & D’Mello, 2019; Hosseini & Guo, 2019; Hutt et al., 2016; Jin et al., 2019, 2020; Pham & Wang, 2015; Zhao et al., 2017). Given that episodes of reported mind wandering tend to occur less frequently than on-task instances, the datasets to which the algorithms are applied are often first preprocessed with class balancing techniques such as SMOTE (Synthetic Minority Over-sampling Technique) (Stewart et al., 2017). The technique, applied to the training set only, oversamples the minority class, i.e., it synthesizes data points in the mind wandering class based on the values of available instances in the same class. The classifiers are typically trained and evaluated by means of leave-one-participant-out or leave-several-participants-out cross-validation (i.e., the data from a single user or from multiple users are included either in the training set or in the test set, but not in both) to ensure that they are robust enough to perform independently of the user. The best user-independent binary classifiers currently reach an accuracy of around 70% (Bixler & D’Mello, 2014; Pham & Wang, 2015). In addition to the standard performance metrics, some studies report the predictive validity of the model by correlating the predicted (rather than actual) rates of mind wandering with learner performance (Bixler & D’Mello, 2014). An issue reported in several studies concerns the prevalence of false positives (i.e., recall is greater than precision). As noted by Stewart et al. (2017), this is relevant for the implementation of mind wandering algorithms in real-world applications, since an overuse of alerts and interventions might have a demotivating effect on the learner.
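To make the evaluation procedure described above more concrete, the following is a minimal sketch of leave-one-participant-out cross-validation in which a simplified, SMOTE-style oversampling step is applied to the training fold only. It is an illustration under stated assumptions, not the pipeline of any of the cited studies: the function names, the toy data, and the choice of k are hypothetical, and the interpolation is a simplified stand-in for the full SMOTE algorithm (library implementations such as SMOTE in imbalanced-learn and LeaveOneGroupOut in scikit-learn would normally be used instead).

```python
import random
from collections import defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out one participant per fold."""
    by_participant = defaultdict(list)
    for idx, pid in enumerate(participant_ids):
        by_participant[pid].append(idx)
    for held_out in sorted(by_participant):
        test_idx = by_participant[held_out]
        train_idx = sorted(i for pid, idxs in by_participant.items()
                           if pid != held_out for i in idxs)
        yield held_out, train_idx, test_idx


def smote_like_oversample(X, y, minority=1, k=2, seed=0):
    """Balance two classes by interpolating between minority-class points.

    Simplified SMOTE-style sketch: each synthetic point lies on the segment
    between a minority instance and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_pts)) - len(minority_pts)
    if len(minority_pts) < 2 or n_needed <= 0:
        return list(X), list(y)  # nothing to interpolate, or already balanced
    X_out, y_out = list(X), list(y)
    for _ in range(n_needed):
        base = rng.choice(minority_pts)
        others = [p for p in minority_pts if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # position along the segment base -> neighbour
        X_out.append([a + t * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority)
    return X_out, y_out


# Toy data (made up for illustration): three participants, two features per
# time window, label 1 = self-reported mind wandering, 0 = on task.
participants = ["p1"] * 4 + ["p2"] * 4 + ["p3"] * 4
X = [
    [0.10, 1.00], [0.20, 0.90], [0.80, 0.30], [0.15, 1.10],  # p1
    [0.12, 0.95], [0.75, 0.35], [0.18, 1.05], [0.85, 0.25],  # p2
    [0.22, 0.98], [0.11, 1.02], [0.78, 0.28], [0.14, 0.92],  # p3
]
y = [0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

for held_out, train_idx, test_idx in leave_one_participant_out(participants):
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_bal, y_bal = smote_like_oversample(X_train, y_train)
    # A classifier would be fit on (X_bal, y_bal) here and scored on the
    # untouched windows of the held-out participant.
    print(held_out, len(y_train), "->", len(y_bal), "training windows;",
          len(test_idx), "test windows")
```

Oversampling inside the loop, after the split, is the key design choice: synthetic mind wandering instances are derived only from training participants, so the held-out participant's data never influence the model and the class distribution of the test fold stays as observed.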

Most recently, several studies on mind wandering detection have made use of deep learning architectures such as convolutional neural networks (CNNs) and feedforward deep neural networks (DNNs). The performance of deep learning models, which are increasingly being used to analyze EEG signals, appears to exceed that of traditional machine learners for brain activity measures. For example, Hosseini and Guo (2019) reported an accuracy of 91.78% using a channel-wise deep CNN model. An additional advantage of the approach was that no feature extraction was necessary in the preprocessing stage. However, the study employed a dataset collected from only two participants; it thus remains to be seen to what extent the resulting model is applicable to other users. Besides EEG data, the deep learning approach has also been tried in combination with automatic computer vision methods. Using two larger datasets, one collected in a lab setting and one in a classroom setting, Bosch and D’Mello (2019) tested a DNN model on a combination of features extracted from upper body movement and facial expressions. Based on the F1 and AUC metrics, their DNN classifiers were able to perform somewhat above chance level but worse than a support vector machine (SVM) classifier, possibly due to the dataset size. Since human observer performance on the same dataset was rather poor, the modest performance of both types of classifiers was likely due to the difficulty of the task, suggesting that observable high-level features such as facial action units and upper body movement may not be the most reliable indicators of mind wandering episodes.
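As a reading aid for the metrics mentioned in this section, the sketch below computes precision, recall, F1, and AUC on a small, made-up set of classifier scores. The data, the 0.5 decision threshold, and the function names are hypothetical and chosen only for illustration; they do not reproduce results from any cited study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive (mind wandering) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 10 time windows, 3 of which were reported as mind wandering.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.6, 0.55, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"auc={auc(y_true, scores):.2f}")

# A detector that gives every window the same score cannot rank windows at all
# and therefore sits at the chance level of AUC = 0.5.
print(f"chance auc={auc(y_true, [0.5] * len(y_true)):.2f}")
```

In this toy case the detector flags five windows, of which only two are true mind wandering episodes, so recall exceeds precision. This is the false-positive-heavy pattern that would translate into unnecessary alerts in an attention-aware learning tool.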

In an attempt to classify mind wandering episodes from EEG measures in more naturalistic settings, Conrad (2008) implemented various machine learning classifiers while students attended an online lecture. While watching a 50-minute lecture, participants were asked to click on a button whenever they caught themselves mind wandering. A linear discriminant analysis revealed that mind wandering could be distinguished from on-task states with an accuracy of 74% using both ERPs and frequency band oscillations. In addition, Dhindsa et al. (2019) recorded EEG activity from participants during a live lecture and intermittently asked participants whether they were mind wandering. Nonlinear SVMs were able to classify mind wandering episodes in individuals with accuracies of over 80%, based on EEG features derived through data-driven feature learning (common spatial patterns). Beyond lecture settings, Hutt, Mills, White, Donnelly, and D’Mello (2016) implemented a mind wandering detector that classified students’ mind wandering episodes during interaction with an intelligent tutoring system from gaze features at levels considerably above chance.

Conclusion

In recent years, research has aimed to triangulate subject self-reports with indirect behavioral and neurophysiological measures to provide a more comprehensive measure of mind wandering. In this chapter, we have shown that the main challenge for detecting mind wandering from neurophysiological features across tasks lies in the fact that groups of tasks appear to vary in terms of the clusters of features that are predictive of mind wandering. This is the case for eye movements, where there are clear discrepancies between tasks with different visual affordances and smaller deviations across tasks that vary in other task demands. As with eye movements, changes in frequencies of neural activity as measured with EEG vary across tasks, with specific task demands being related to whether activity is higher or lower for a specific frequency band. EEG has mainly been applied in the context of visual attention tasks, such as the SART and other vigilance tasks. Although recent work has extended into domains that are more relevant for education, such as (online) lectures, relatively little is known about how brain activity changes during mind wandering in learning contexts. However, the success rates of several machine learning attempts suggest that using EEG signals as mind wandering correlates, potentially in combination with other measures such as eye movements or bodily behaviors, might be one of the most promising ways forward in terms of measuring mind wandering from neurophysiological data.

However, there are still other challenges that need to be addressed. An important issue is scalability. Currently, EEG is not scalable because it is expensive and intrusive, and cheaper EEG sensors tend to have inferior signal quality. Eye tracking might be a better option, since eye trackers are relatively cheap and unintrusive. With the development of better webcams and better analytical strategies for detecting eye movement signals from video data, eye tracking might in the future be possible on a laptop, phone, or tablet without any additional hardware. Although neurophysiological measures, in particular EEG, might be good at distinguishing between mind wandering states at an individual level in both lab and classroom settings, patterns diverge across individuals. This might in part explain the discrepancies in findings that are observed across studies, in addition to or in interaction with task demands. As such, we propose that the availability of different deep learning architectures, in combination with data collected from multiple participants across different channels including eye tracking, EEG, and fMRI, may provide solutions to some of these challenges.