A Taxonomy of Attention

With roots in a program of research begun by Michael Posner over 40 years ago (Posner and Boies 1971), three isolable functions of attention—alertness, orienting, and executive control—have been identified and linked to specific neural networks (Posner and Petersen 1990; Fan et al. 2005). In the domain of space, where selection has been referred to as orienting and most of the research has been on visual orienting, two important distinctions were first made by Posner (1980) and have since been highlighted in work from Klein's laboratory (for a review, see Klein 2009). One concerns whether selection is accomplished by an overt reorientation of the receptor surface (an eye movement) or by a covert reorientation of internal information-processing mechanisms. The other concerns whether the eye movement system or attention is controlled primarily by exogenous (often characterized as bottom-up or reflexive) means or by endogenous (often characterized as top-down or voluntary) means.

Helmholtz provided the first demonstration that attention could be shifted covertly and, consequently, independently of the direction of gaze. Klein (1980; Klein and Pontefract 1994) and others (e.g., Hunt and Kingstone 2003; Schall and Thompson 2011) have demonstrated that, when control is purely endogenous, such shifts of attention are not accomplished via sub-threshold programming of the oculomotor system. On the other hand, when orienting is controlled exogenously, by bottom-up stimulation, it is difficult to disentangle activation of covert orienting from activation of oculomotor programs.

In the domain of covert orienting, Klein has emphasized the importance of distinguishing whether control is (primarily) endogenous or exogenous, because different resources or mechanisms seem to be recruited to the selected location or object when the two control systems are employed. This assertion was first supported by the following double dissociation: (1) when exogenously controlled, attention interacts with opportunities for illusory conjunctions and is additive with non-spatial expectancies, and (2) when endogenously controlled, attention is additive with opportunities for illusory conjunctions and interacts with non-spatial expectancies (Briand and Klein 1987; Briand 1998; Handy et al. 2001; Klein and Hansen 1990; Klein 1994). Several other dissociations discovered by others reinforce Klein's conclusion that different resources are recruited when orienting is controlled endogenously versus exogenously (for reviews, see Klein 2004, 2009).

Thinking about the importance of this distinction in the world of orienting led Klein and Lawrence to propose an alternative taxonomy (Klein and Lawrence 2011), illustrated in Fig. 1, in which two modes of control (endogenous and exogenous) operate in different domains (time, space, modality, task, etc.). Searching entails the endogenous and exogenous control of attention in space and time. In contrast to the literature using Posner's cuing paradigm, however, in typical search tasks the endogenous/exogenous distinction is often not made explicit. In spatial search, for example, perhaps this is because even when search is hard (the target does not exogenously capture attention) we typically do not experience volitional control of the search process—of the sequence of decisions about where to look next for the target. It has been suggested that these "decisions" are typically made by low-level subroutines (Klein and Dukewich 2006). It seems likely that the endogenous control of search is instantiated before the search episode begins, based on the observer's knowledge about properties of the target (setting up a template-matching process) and distractors (e.g., establishing attentional control settings to implement guided search).

Fig. 1

A taxonomy of attention proposed by Klein and Lawrence (2011)

Natural History of a Search Episode

A typical search episode begins with some specification of what the target is; usually some information about the nature of the material to be searched through for the target; perhaps some useful information on how to find it; and, critically, what to do when it is found. The human searcher is thought to incorporate these task- or goal-oriented elements into a mental set, program, or strategy so that performance will optimize the payoffs. In Broadbent's theory (1958) an important component of this process was "setting the filter" so that task-relevant items (targets) would have access to limited-capacity processing mechanisms while task-irrelevant items would be excluded. Duncan (1981) would later provide a useful recasting of Broadbent's ideas. Instead of "filtering" he referred to a "selection schedule" and, recognizing the many empirical demonstrations that an unselected stimulus could nevertheless activate complex internal representations, he suggested that the limitation has more to do with availability for reporting an item than with the quality or nature of an item's internal representation. We see subsequently proposed endogenous control mechanisms such as attentional control settings (ACS) (Folk et al. 1992) and "task-set reconfiguration" (Monsell 1996) as firmly rooted in these earlier ideas.

During the search episode the efficient performer must represent the target and the feature(s) that will distinguish the target from the distractors. Representations activated by the spatial search array or temporal search stream are compared against these target representations to determine whether the target is present and, if so, to report its properties according to the observer's goals. This comparison process might take place one item at a time or in parallel across the items in the search array or stream.

Two paradigms for exploring the information-processing dynamics of searching will be emphasized in this chapter. These paradigms were developed to study, in relatively pure form, searching in space and in time. Searching in space entails the allocation of attention to items distributed in space and presented at the same time. Searching in time entails the allocation of attention to items distributed in time and presented at the same location. With a few exceptions (e.g., Arend et al. 2009; Keele et al. 1988; McLean et al. 1982; Vul and Rich 2010), searching in space and time has been studied separately, usually in studies with a similar objective: understanding the role of attention in detecting, identifying, or localizing targets. We believe that it will be empirically fruitful and theoretically timely to combine these somewhat separate efforts. It will also be useful, because in the real world searching often combines these two pure forms.

Searching in Space

There are many studies from before 1980 that used a wide variety of spatial search tasks. The spatial search paradigm emphasized here (see Fig. 2) was imbued with excitement by Anne Treisman's use of it (Treisman and Gelade 1980; Treisman and Schmidt 1982) to provide support for her feature integration theory, in which spatial attention is the binding agent for otherwise free-floating features. When observers are asked to indicate whether a target is present in an array of distractors, two dramatically different patterns are frequently reported. In one case (i.e., difficult search—the target is not defined by a single unique feature), illustrated in Fig. 2, reaction time for both target-absent and target-present trials is a roughly linear function of the number of distractors, and the slope for the target-absent trials is approximately twice that of the target-present trials. This pattern is intuitively compatible with (indeed predicted by) a serial self-terminating search (SSTS) process in which each item (or small group of items) is compared against a representation of the target and this process is repeated until a match is found or until the array has been exhausted. In the other case (not illustrated) (i.e., easy search—the target is defined by a single unique feature), reaction time is unaffected by the number of distractor items. Phenomenologically, instead of having to search for the target, it "pops out" of the array.
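To make the logic of a serial self-terminating search concrete, the short simulation below is a minimal sketch of our own (not a model taken from the literature); the timing constants are arbitrary assumptions. Inspecting items one at a time in random order and stopping at a match yields target-present slopes roughly half as steep as target-absent slopes, while a pop-out detector yields flat functions.

```python
# Minimal Monte Carlo sketch (illustrative only) of serial self-terminating
# search (SSTS) versus feature "pop-out" search.
import random

INSPECT_MS = 50   # assumed time to compare one item against the target template
BASE_MS = 400     # assumed residual (non-search) reaction time

def ssts_trial(set_size, target_present):
    """Inspect items one at a time, in random order, stopping at the target."""
    if target_present:
        target_slot = random.randrange(set_size)          # target occupies a random position
        inspections = 0
        for slot in random.sample(range(set_size), set_size):
            inspections += 1
            if slot == target_slot:
                break                                      # self-terminate on a match
    else:
        inspections = set_size                             # exhaustive when no target is present
    return BASE_MS + INSPECT_MS * inspections

def popout_trial(set_size, target_present):
    """Easy search: a unique feature is detected without item-by-item inspection."""
    return BASE_MS + INSPECT_MS                            # one 'look', regardless of set size

if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        present = sum(ssts_trial(n, True) for _ in range(10000)) / 10000
        absent = sum(ssts_trial(n, False) for _ in range(10000)) / 10000
        print(f"set size {n:2d}: present ~{present:.0f} ms, absent ~{absent:.0f} ms")
    # The present-trial slope is about INSPECT_MS/2 per item and the absent-trial
    # slope about INSPECT_MS per item: the roughly 2:1 ratio described above.
```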

Fig. 2

A prototypical "present/absent" search task (is there a solid "O" in the display?) is illustrated on the left. Typical results are illustrated on the right, showing reaction time to make the decision (open symbols = target-absent trials; filled symbols = target-present trials) as a function of the number of items in the display. (Adapted from Treisman 1986)

This model task and the theory Treisman inferred from its use have been remarkably fruitful in generating: modifications of the model task (e.g., the preview-search paradigm of Watson and Humphreys 1997; the dynamic search paradigm of Horowitz and Wolfe 1998), theoretical debates (such as: are so-called "serial" search patterns like that illustrated in Fig. 2 caused by truly sequential or by parallel processes; and, when search is a sequential process of inspections, how much memory is there about rejected distractors? see Klein and Dukewich 2006 for a review), empirical generalizations (e.g., Wolfe's 1998 review; the search surface of Duncan and Humphreys 1989), and conceptual contributions (e.g., the guided search proposal of Wolfe et al. 1989; the foraging facilitator proposal of Klein 1988).

The model task and the theory of Treisman encouraged Klein and Dukewich (2006) to address the question of whether search is primarily driven by serial or parallel mechanisms. While their advice is rooted in basic research on spatial search, we believe that it applies equally to searching in time and to real-world search behavior:

When there is more than one good strategy to solve a problem it seems reasonable to assume that nature may have figured out a way to take advantage of both….We recommend that future research seek to determine, rather than which strategy characterizes search, ‘‘when’’ and ‘‘how’’ the two strategies combine. (Klein and Dukewich 2006, p. 651)

Searching in Time

In the mid-1960s Molly Potter discovered that people could read text presented using rapid serial visual presentation (RSVP), that is, with words presented one after the other at the same location in a rapid sequence. A few decades later this mode of stimulus presentation began to be used as a tool for exploring the consequences of limited processing capacity, particularly for dealing with multiple "targets" in streams of unrelated items (Broadbent and Broadbent 1987; Weichselgartner and Sperling 1987). Broadbent and Broadbent (1987), for example, showed how difficult it is to identify two targets when they are presented in close succession.

The difficulty identifying subsequent items after successfully identifying an earlier one was subsequently named an "attentional blink" by Raymond et al. (1992). The blink, and the task for exploring it developed by Broadbent, Raymond, and Shapiro, propelled this paradigm to center stage in attention research. In the seminal paradigm of Raymond et al. (1992) (see Fig. 3, left/bottom), multiple letters are presented rapidly and sequentially at the same location (in RSVP). In the sequence of letters, all but one of which are black, there are two targets (separated by varying numbers of distractors) and the observer has two tasks: report the identity of the white letter and report whether there was an X in the stream of letters after the white letter.

Fig. 3

Two different methods that have been used to explore the attentional blink. Both entail presenting a sequence of individual alphanumeric items using RSVP (with about 100 ms separating item onsets). The stream on the left illustrates the “detect-X” task pioneered by Raymond et al. (1992). After a random number of black letters the first target, a white letter (T1), is displayed. Then, at varying lags after the presentation of T1, an X (T2, or probe) might or might not (this alternative is shown in the box with the dashed line) be presented. At the end of the stream the observer reports the identity of the white letter and whether or not an X had appeared in the stream. Typical results from this task are shown in the inset at the bottom. Open symbols show the probability of correctly reporting that an X was present as a function of its position following a white letter when that letter had been correctly identified. Filled symbols show the same results when there was no requirement to report the white letter. The stream on the right illustrates the paradigm developed by Chun and Potter (1995) and used by many others. Here there is a stream of items in one category (digits) in which two targets from another category (letters) are embedded. At the end of the stream the observer’s task is to report the identities of the two targets. Typical results (accuracy of T2 reports when T1 was identified correctly) are shown in the inset at the top

One possible weakness of this particular paradigm (often called "detect X") is that the "blink" it generates and measures may have two quite different sources: the demands of two speeded identifications and the need to switch the mental set (the selection schedule or filter setting) from color to form (from "white" to "X"). A more general paradigm (one more like Broadbent's) is often used to avoid such switching. Chun and Potter (1995) used one version of this paradigm (Fig. 3, right/top) in which the observer's task is to report the identity of two letter targets that are embedded in a stream of digits.
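For readers unfamiliar with these procedures, the sketch below illustrates how single trials of the two paradigms in Fig. 3 might be constructed. It is only an illustration under assumed parameters: the roughly 100 ms SOA comes from the Fig. 3 caption, but the stream lengths, letter/digit sets, and lag ranges are our assumptions, not the original specifications.

```python
# Illustrative construction of single attentional-blink trials (assumed parameters).
import random
import string

SOA_MS = 100  # ~100 ms between item onsets, as in the Fig. 3 caption

def detect_x_trial(lag, probe_present):
    """Raymond et al. (1992)-style 'detect-X' stream: black letters, one white letter (T1),
    and possibly an X (T2, the probe) 'lag' positions after T1. Returns (letter, color) pairs."""
    letters = [c for c in string.ascii_uppercase if c != "X"]
    stream = [(random.choice(letters), "black")
              for _ in range(random.randint(7, 15))]         # leading black letters (assumed range)
    stream.append((random.choice(letters), "white"))          # T1: the white letter
    for i in range(1, 9):                                     # post-T1 items
        if i == lag and probe_present:
            stream.append(("X", "black"))                     # T2: the probe
        else:
            stream.append((random.choice(letters), "black"))
    return stream

def dual_target_trial(lag):
    """Chun and Potter (1995)-style stream: digits with two embedded letter targets."""
    t1, t2 = random.sample(string.ascii_uppercase, 2)
    stream = [random.choice(string.digits) for _ in range(random.randint(7, 15))]
    stream.append(t1)                                         # first letter target
    stream += [random.choice(string.digits) for _ in range(lag - 1)]
    stream.append(t2)                                         # second letter target, 'lag' items after T1
    stream += [random.choice(string.digits) for _ in range(3)]
    return stream
```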

As with the spatial search paradigm, these methods for exploring "searching in time" using one or more targets embedded in a stream of rapidly presented items have been remarkably fruitful in generating modifications of the model task, theoretical debates, empirical generalizations, and conceptual contributions (see Dux and Marois 2009 for a review).

Searching in Space and Time: Some Comparisons

The Nature of the Stimuli

It seems likely that if a certain kind of stimulus pops out in a spatial search it might also do so in temporal search, and vice versa. Duncan and Humphreys (1989) identified two principles that interact in determining the difficulty of searching in space for a target among distractors. One factor is: how similar is the target to the distractors? The other is: how heterogeneous are the distractors? How these factors interact to determine search difficulty is captured by their "search surface" (see Fig. 4): neither factor alone makes searching particularly hard, but when combined they conspire to make search extremely difficult. Would searching in time (in RSVP) show the same relationship? While there are hints that this might be true, we are aware of no dedicated studies.
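The toy function below is our own simplification (not Duncan and Humphreys' formal treatment) of that interaction: difficulty stays low when either target-distractor similarity or distractor heterogeneity is low and rises sharply only when both are high.

```python
# Toy "search surface" (our simplification; constants and 0-1 scales are arbitrary).
def predicted_slope(td_similarity, d_heterogeneity, base=2.0, gain=80.0):
    """Output mimics the predicted slope of the reaction time/set size function."""
    return base + gain * td_similarity * d_heterogeneity  # multiplicative interaction

if __name__ == "__main__":
    for sim, het in [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]:
        print(f"similarity={sim}, heterogeneity={het}: "
              f"slope ~{predicted_slope(sim, het):.0f} ms/item")
    # Only the high-similarity, high-heterogeneity combination yields a steep slope.
```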

Fig. 4

Center: The "search surface" (adapted from Duncan and Humphreys 1989) represents the difficulty of finding a target (the height of the surface is the predicted slope of the reaction time/set size function) as a function of two properties of the search array: target-distractor similarity and distractor heterogeneity. Corners: Sample search arrays illustrating the four corners of the search surface. The line with the obviously unique slope in the lower left panel is the target in all four panels. The target is easily found when it is accompanied by a homogeneous array of distractors of a very different orientation (lower left)

There are a variety of other stimulus features for which we could pose a similar question: if your own name pops out of an RSVP stream and even escapes the attentional blink, will it also pop out in spatial search? Will socially important stimuli such as faces (emotional or otherwise) capture attention in both space and time? Given the history of this symposium, we can ask "What does motivation have to do with it?" For example, would pictures of food be easier to find when you are hungry than after you have just eaten? Will attention be captured by stimuli that have been previously rewarded?

The Participants

There are many participant factors that could be explored. We would expect searching in space and time to show similar benefits from training and expertise, for example. The same expectation would apply to developmental changes. Exploring the efficiency of spatial search across the lifespan, Hommel et al. (2004) found a U-shaped function with less efficient performance at the extremes. Based on their findings, if you have recently turned 25 or so, you are at your peak. A similar pattern, though perhaps with a slightly older "optimum" age, was reported for the magnitude of the attentional blink by Georgiou-Karistianis et al. (2007).

Looking at patients with focal brain damage or known neurological problems would provide an arena for comparison that could have relevance to the neural systems involved in search. The examples described here are from studies of patients with unilateral neglect, a disorder commonly associated with parietal lesions. In spatial search tasks patients with neglect are slower and less likely to find targets, particularly when these are present in the neglected hemifield (e.g., Butler et al. 2009; Eglin et al. 1989). The right-to-left gradient of increasing omissions (see Fig. 5a) might be related to a difficulty disengaging attention from attended items toward items in the neglected field (for a review, see Losier and Klein 2001). Poor performance, particularly repeated reports of targets (cf. Butler et al. 2009), might be attributed, in part, to defective spatiotopic coding of inhibition of return (IOR), which depends on an intact right parietal lobe (Sapir et al. 2004). This would converge with the proposal that the function of IOR is to encourage orienting to novelty (Posner and Cohen 1984) and, consequently, to discourage reinspections (Klein 1988). Using an RSVP task, Husain et al. (1997) showed that the attentional blink was longer and deeper in patients suffering from visuo-spatial neglect due to damage to the right hemisphere. In this study, all the items were presented at fixation. Consequently, this temporal deficit might be a more general version of the aforementioned disengage deficit: difficulty disengaging attention from any item on which it is engaged.

Fig. 5

Spatial and temporal processing in patients suffering from neglect and in control participants. a Probability of report [by normal controls (NC), control patients with right hemisphere lesions (RHC), and patients suffering from neglect following damage to the right hemisphere (NEG)] of target letters and numbers among non-alphanumeric distractors presented in a 20 by 30 degree spatial array in peripersonal and extrapersonal space (from Butler et al. 2009). b and c Probability of detecting an X in the "detect X" paradigm illustrated in Fig. 3. Unfilled squares represent performance when participants were not required to report the white letter in the stream (single task). Filled squares represent performance on the "detect X" task (second target) when participants were required to report the white letter (first target). b Data from normal controls. c Data from patients with neglect. (Data in b and c are from Husain et al. 1997; figures b and c are adapted from Husain and Rorden 2003)

The Role(s) of Endogenous Attention in Time and Space

As noted earlier, the concept of limited capacity seems to play an important role in both kinds of search. When searching in space, one reflection of this limit is seen in the relatively steep slopes that characterize difficult searches (searches for which the target does not pop out). As noted earlier, one way to explain steep slopes is in terms of the amount of time required for an attentional operator to sequentially inspect individual items in the array, or to sequentially inspect regions (when it is possible for small sets of nearby items to be checked simultaneously), until the target is located. When searching in time, this limit is seen as an attentional blink—in the period immediately following the successful identification of a target, some important target-identifying resources appear to be relatively unavailable.

An interesting difference that characterizes at least the standard versions of these tasks is that stimuli in RSVP are data limited: every item is both brief and masked, while in a typical spatial search episode the stimulus array is neither brief nor masked. With multiple items displayed at the same time, spatial search is characteristically resource limited. That noted, several researchers (e.g., Dukewich and Klein 2005; Eckstein 1998) have explored spatial search using limited exposure durations. And, while in this chapter we are concentrating on relatively pure examples of searching in space and time, there have also been some highly productive hybrids (such as the dynamic search condition of Horowitz and Wolfe 1998, 2003).

The ideas of attentional control settings and contingent capture seem to operate similarly in both space and time. In spatial search it has been demonstrated that attentional capture is contingent on the features one is searching for (Folk et al. 1992) as well as on the locations where targets will be found (Ishigami et al. 2009; Yantis and Jonides 1990). Capture by distracting non-targets that share features with the target has also been demonstrated in temporal search (Folk et al. 2008).

Another aspect of attentional control concerns its intensity (Kahneman 1973). For example, in his review of IOR, Klein (2000) proposed that the strength of attentional capture by task-irrelevant peripheral cues would depend directly on the degree to which completing the target task requires attention to peripheral onsets. As a consequence of increased capture, attentional disengagement from the cue, and therefore the appearance of IOR, would be delayed.

A similar mechanism was uncovered in our studies of the attentional blink. The initial question we (McLaughlin et al. 2001) posed was whether the difficulty of identifying the first target (T1), when varied randomly from trial to trial, would affect blink magnitude. We used the target-mask, target-mask paradigm (which, it must be noted, demonstrates that it is not necessary to use RSVP streams to explore searching in time) pioneered by Duncan et al. (1994). As shown in the bottom panel of Fig. 6, we varied how much data was available about either T1 or T2 (the second target) in order to implement an objective, quantifiable, and data-driven difference in target identification difficulty. We designed the experiment so as to avoid any location or task switching (the task was simply to report the two letters). Despite the success of our data-driven manipulation of T1 difficulty, the answer to this question was a resounding "no" (see the top panel of Fig. 7). When we manipulated the difficulty of T2, this had dramatic effects on T2 performance and no effect on T1 (bottom panel of Fig. 7).

Fig. 6

Methods used by McLaughlin et al. (2001) to explore the effect of the difficulty of target (T) processing upon the magnitude of the blink using a target-mask, target-mask paradigm to induce and measure the blink. The difficulty of either T1 (first target) or T2 (second target) was manipulated by varying the relative durations of the target and mask (M)

Fig. 7

Results from McLaughlin et al. (2001). (See Fig. 6 for explanation of the difficulty manipulation)

Why would such a dramatic difference in the difficulty of T1 have no effect on the blink? We suggested that this was because the blink reflects the effort the participant expects to have to exert in advance of the trial—an ACS concerning how much processing resource might be needed to perform the task. Because we randomly intermixed the three difficulty levels, and because (apparently) resources are not (or cannot be) re-allocated in real time when T1 is presented, all trials would have been subjected to the same ACS. We tested this proposal in a subsequent paper (Shore et al. 2001) by comparing the results when the same data-driven manipulation of T1 difficulty was mixed or blocked. As predicted by an ACS view, when we blocked difficulty there was a significant effect of T1 difficulty on the magnitude of the AB (particularly between the hard and medium/easy conditions; see Fig. 8).
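As a concrete illustration, the sketch below computes T2|T1 accuracy by lag and one common summary of blink magnitude (the drop in T2|T1 accuracy at short lags relative to the longest lag). We are not claiming this is the exact measure used by McLaughlin et al. (2001) or Shore et al. (2001), and the trial-record format is an assumption.

```python
# One common way to quantify attentional-blink magnitude from trial-level data
# (illustrative; not necessarily the measure used in the studies cited above).
from collections import defaultdict
from statistics import mean

def t2_given_t1_accuracy(trials, condition):
    """Mean T2 accuracy at each lag, counting only trials on which T1 was reported correctly."""
    by_lag = defaultdict(list)
    for t in trials:
        if t["condition"] == condition and t["t1_correct"]:
            by_lag[t["lag"]].append(1.0 if t["t2_correct"] else 0.0)
    return {lag: mean(vals) for lag, vals in sorted(by_lag.items())}

def blink_magnitude(trials, condition, short_lags=(1, 2, 3)):
    """Long-lag T2|T1 accuracy minus mean short-lag T2|T1 accuracy."""
    acc = t2_given_t1_accuracy(trials, condition)
    return acc[max(acc)] - mean(acc[lag] for lag in short_lags if lag in acc)

# Usage with hypothetical trial dictionaries:
# trials = [{"condition": "hard-blocked", "lag": 2, "t1_correct": True, "t2_correct": False}, ...]
# print(blink_magnitude(trials, "hard-blocked"))
```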

Fig. 8

Results from Shore et al. (2001). Magnitude of the attentional blink as a function of T1 difficulty and whether T1 difficulty could be predicted (blocked) or not (randomly intermixed, as in McLaughlin et al. 2001)

There may be a related "strategic" effect in both the spatial and temporal search literatures. Smilek et al. (2006), in a paper entitled "Relax! Cognitive strategy influences visual search," seemed to show that simply telling their participants not to try so hard reduced their search slopes (i.e., increased their search efficiency). Similarly, Olivers and Nieuwenhuis (2005) reported that relaxing by listening to music could reduce the attentional blink.

Binding of Targets in Space and Time

We will end this section by describing one empirical strategy for comparing searching in space and in time. The background comes from two papers that reported interesting "slippage" of targets in space and time. The first, by Snyder (1972), was about searching in space; the second, by McLean et al. (1982), was about searching in time. In Snyder's study multiple items were presented briefly at the same time in different locations, whereas in McLean et al. (1982) multiple items were presented rapidly in time at the same location. For present purposes we will emphasize the conditions in which the participant's task was to report the identity of a target letter that was defined by color. As we will see, both studies reported a certain amount of sloppiness of the attentional beam (or window); whether the errors were true illusory conjunctions is not so important as their distribution in space and time.

In Snyder's spatial search task, 12 letters were placed in a circular arrangement on cards for presentation using a tachistoscope. On each trial the participant had to verbally report the name of a uniquely colored letter and then report its position (using an imaginary clockface: 1–12). Stimulus duration was adjusted on an individual basis so that accuracy of letter identification was about 50 % (regardless of the accuracy of letter localization). The key finding for present purposes was that, when reporting identities, errors were more likely to be spatially adjacent to the target letter than further away. Snyder found a similar pattern of spatial slippage when the feature used to identify the target was form-based (a broken or inverted letter).

In McLean et al.'s temporal search task, the target color varied from trial to trial and the participant's task was to report the identity of the single item presented in the target color. (In another condition the participant reported the color of a target defined by its identity.) Each stream, created photographically using movie film, consisted of 17 letters rendered using 5 different identities and 5 different colors. Films were projected on the screen with SOAs of 67 ms (15 frames/s). The key finding for present purposes was an excess of temporally adjacent intrusion errors relative to reports of items in the stream temporally more distant from the target (interestingly, immediate post-target intrusions were more likely than immediate pre-target intrusions).

If there were one attentional beam that operates in both space and time to integrate features into objects, and if there were individual differences in the efficacy of this beam, then we would expect the spatial and temporal sloppiness reported by Snyder (1972) and McLean et al. (1982) to be correlated across individuals. To test this idea, data on spatial and temporal search must be obtained from the same participants. We have begun to explore this possibility and will briefly report some of our preliminary findings.

In our first project we tested 46 participants on spatial and temporal search tasks that were closely matched to those of Snyder (1972) and McLean et al. (1982). The order of tasks was counterbalanced. In order to ensure that there would be a sufficient number of errors while performance remained substantially above chance, for each task we titrated the exposure duration so that overall accuracy in reporting the target's identity was in the 50–60 % correct range. The key results are illustrated in Fig. 9.

Fig. 9

Results from Ishigami and Klein (2011). Observers were searching in space (left panel) and time (right panel) for a target of a pre-specified color. Accurate reports of the target's identity are indicated by the percentages shown above relative position = 0. The remaining data are the percentage of erroneous reports of items from the array (that were not the target) as a function of the distance (in space and time) of these items relative to the target. Positive positions are, relative to the target, clockwise in space and after in time

We were quite successful in achieving the overall level of accuracy we were aiming for (50–60 % correct). While the scales are different (there were fewer errors in the spatial task), the patterns are similar in space and time, and the key findings from Snyder (1972) and McLean et al. (1982) were replicated: errors are more likely to come from positions adjacent to the target. Moreover, in space there were more counterclockwise than clockwise errors, and in time there were more post- than pre-target errors. For each participant and task we computed a measure of "slippage": the average rate of near errors (±1) minus the average rate of far errors (all other erroneous reports from the presented array). The correlation between spatial and temporal slippage was very close to zero (r(44) = 0.03), suggesting that the attentional beam that attaches identities to locations may not be the same beam that attaches identities to time.
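For concreteness, here is a sketch of the slippage measure just described and of the across-participant correlation. The input format (error rates indexed by each reported item's position relative to the target, excluding position 0) is an assumption about how such data might be organized; this is not our actual analysis code.

```python
# Sketch of the "slippage" measure and the across-participant correlation (assumed data format).
from statistics import mean

def slippage(error_rates_by_offset):
    """Average rate of near errors (relative position +/-1) minus average rate of far errors
    (all other erroneous reports from the presented array)."""
    near = [rate for off, rate in error_rates_by_offset.items() if abs(off) == 1]
    far = [rate for off, rate in error_rates_by_offset.items() if abs(off) > 1]
    return mean(near) - mean(far)

def pearson_r(xs, ys):
    """Pearson correlation, e.g., between spatial and temporal slippage across participants."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Usage with hypothetical per-participant scores:
# spatial = [slippage(p["spatial_errors"]) for p in participants]
# temporal = [slippage(p["temporal_errors"]) for p in participants]
# print(pearson_r(spatial, temporal))
```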

Conclusion

We have discussed the concept of attention—selection made necessary by limited processing capacity—and some of its manifestations in spatial and in temporal search behavior. As described in this chapter, searching in space and time has typically been studied separately, predominantly with the objective of understanding the role of attention in detecting, identifying, or localizing targets. However, in the real world we are often searching for targets that are surrounded by distractors in space, and all of this happens in scenes that unfold over time (e.g., looking for a particular exit on a highway while driving, or for your child in a busy playground). We described above our first attempt to compare searching in time and space in the same individuals. Preliminary results revealed a null correlation between spatial and temporal slippage, suggesting different selection mechanisms in these two domains. We plan next to experimentally balance the two tasks (space and time) so that we can draw firmer conclusions about this relationship, and to merge the two tasks so that we can explore searching in space and time simultaneously.

In the course of this chapter we have raised several questions: Will the principles (Duncan and Humphreys 1989) that determine the difficulty of searching in space generalize to searching in time? Are the same brain regions responsible for spatial and temporal search (e.g., Arend et al. 2009)? Do attentional control settings work in the same way in spatial and temporal search? Is the binding of features to space and to time implemented by one "beam" or by independent "beams," each operating in its own domain? Answers to these questions, which in some cases the literature is beginning to provide, will have important theoretical and practical implications.