Introduction

Cognitive neuroscience is a discipline that attempts to determine the neural mechanisms underlying cognitive processes. Specifically, cognitive neuroscientists test hypotheses about brain–behavior relationships that can be organized along two conceptual domains. The first is functional specialization: the idea that functional modules exist within the brain, that is, areas of the cerebral cortex that are specialized for a specific cognitive process. The second is functional integration: the idea that a cognitive process can be an emergent property of interactions among a network of brain regions, which suggests that a brain region can play a different role across many functions.

Early investigations of brain–behavior relationships consisted of careful observation of individuals with neurological injuries resulting in focal brain damage. The idea of functional specialization evolved from hypotheses that damage to a particular brain region was responsible for a given behavioral syndrome that was characterized by a precise neurological examination. For instance, the association of nonfluent aphasia with right-sided limb weakness implicated the left hemisphere as the site of language abilities. Moreover, upon the death of a patient with a neurological disorder, clinicopathological correlations provided confirmatory information about the site of damage causing a specific neurobehavioral syndrome such as aphasia. For example, in 1861, Paul Broca’s observations of nonfluent aphasia in the setting of a damaged left inferior frontal gyrus cemented the belief that this brain region was critical for speech output (Broca 1861). The advent of structural brain imaging more than 100 years after Broca’s observations, first with computerized tomography and later with magnetic resonance imaging (MRI), paved the way for more precise anatomical localization in the living patient of the cognitive deficits that develop after brain injury. The superb spatial resolution of structural neuroimaging has reduced the reliance on the infrequently obtained autopsy for making brain–behavior correlations.

Functional neuroimaging, broadly defined as techniques that measure brain activity, has expanded our ability to study the neural basis of cognitive processes. One such method, functional MRI (fMRI), has emerged as an extremely powerful technique that affords excellent spatial and temporal resolution. Measuring regional brain activity in healthy subjects while they perform cognitive tasks links localized brain activity with specific behaviors. For example, functional neuroimaging studies have demonstrated that the left inferior frontal gyrus is consistently activated during the performance of speech production tasks in healthy individuals (Buckner et al. 1995). Such findings from functional neuroimaging are complementary to findings derived from observations of patients with focal brain damage. This chapter focuses on the principles, as well as the challenges, underlying fMRI as a cognitive neuroscience tool, highlighting many questions that fMRI is well suited to address.

Inference in Functional Neuroimaging Studies of Cognitive Processes

Insight regarding the link between brain and behavior can be gained through a variety of approaches. It is unlikely that any single neuroscience method is sufficient to fully investigate any particular question regarding the mechanisms underlying cognitive function. From a methodological point of view, each method will offer different temporal and spatial resolution. From a conceptual point of view, each method will provide data that will support different types of inferences that can be drawn from it. Thus, data obtained addressing a single question, but derived from multiple methods, can provide more comprehensive and inferentially sound conclusions.

Functional neuroimaging studies support inferences about the association of a particular brain system with a cognitive process. However, it is difficult to prove in such a study that the observed activity is necessary for an isolated cognitive process because perfect control over a subject’s cognitive processes during a functional neuroimaging experiment is never possible. Even if the task a subject performs is well designed, it is difficult to demonstrate conclusively that he or she is differentially engaging a single, identified cognitive process. The subject may engage in unwanted cognitive processes that either have no overt, measurable effects or are perfectly confounded with the process of interest. Consequently, the neural activity measured by the functional neuroimaging technique may result from some confounding neural computation that is itself not necessary for executing the cognitive process seemingly under study. It is important to note that these limits on inference are not specific to fMRI; they apply to all methods of physiological measurement (e.g., electroencephalography, EEG; magnetoencephalography, MEG).

The inference of necessity cannot be made without showing that inactivating a brain region disrupts the cognitive process in question. However, lesions in patients are often extensive, damaging local neurons and “fibers of passage.” For example, damage to prominent white matter tracts can cause cognitive deficits similar to those produced by cortical lesions, such as the amnesia resulting from lesions of the fornix, the main white matter pathway projecting from the hippocampus (Gaffan and Gaffan 1991). In addition, connections from region “A” may support the continued metabolic function of region “B,” but region A may not be computationally involved in certain processes undertaken by region B. Thus, damage to region A could impair the function of region B via two possible mechanisms: (1) diaschisis (Feeney and Baron 1986) and (2) retrograde transsynaptic degeneration. Consequently, studies of patients with focal lesions cannot conclusively demonstrate that the neurons within a specific region are themselves critical to the computational support of an impaired cognitive process.

Empirical studies using lesion and electrophysiological methods illustrate these limits on the types of inferences that can be logically drawn from each method. For example, in monkeys, single-unit recording reveals neurons in the lateral prefrontal cortex (PFC) that increase their firing during the delay between the presentation of information to be remembered and a few seconds later when that information must be recalled (Fuster and Alexander 1971; Funahashi et al. 1989). These studies are taken as evidence that persistent neural activity in the PFC is involved in temporary storage of information, a cognitive process known as working memory. The necessity of PFC for working memory was demonstrated in other monkey studies showing that PFC lesions impair performance on working memory tasks, but not on tasks that do not require temporarily holding information in memory (Funahashi et al. 1993). Persistent neural activity during working memory tasks is also found in the hippocampus (Watanabe and Niki 1985; Cahusac et al. 1989). Hippocampal lesions, however, do not impair performance on most working memory tasks (Alvarez et al. 1994), which suggests that the hippocampus is involved in maintaining information over short periods of time but is not necessary for this cognitive operation. Observations in humans support this notion (Jeneson et al. 2010; Corkin 1984). For example, the well-studied patient H. M., with complete bilateral hippocampal damage and a severe inability to learn new information, could nevertheless perform normally on working memory tasks such as digit span (Corkin 1984). The hippocampus is implicated in long-term memory, especially when relations between multiple items or multiple features of a complex, novel item must be retained. Thus, the hippocampus may only be engaged during working memory tasks that require someone to subsequently remember novel information (Ranganath and D’Esposito 2001).

When the results from lesion and functional neuroimaging studies are combined, a stronger level of inference emerges. As in the examples of Broca’s aphasia or working memory, a lesion of a specific brain region causes impairment of a given cognitive process and when engaged by an intact individual, that cognitive process evokes neural activity in the same brain region. Given these findings, the inference that this brain region is computationally necessary for the cognitive process is stronger than the data derived from each study performed in isolation. Thus, lesion and functional neuroimaging studies are complementary, each providing inferential support that the other lacks.

Other types of inferential failures can occur in the interpretation of functional neuroimaging studies when other common assumptions do not hold true. First, it is assumed that if a cognitive process activates a particular brain region (evoked by a particular task), the neural activity in that brain region must depend on engaging that particular cognitive process. For example, a brain region showing greater activation during the presentation of faces than to other types of stimuli, such as photographs of cars or buildings, is considered to engage in face perception processes. However, this region may also support other higher-level cognitive processes such as memory processes, in addition to lower-level perceptual processes (Druzgal and D’Esposito 2001). See (Henson 2006) for a further discussion of this issue.

The opposite type of inference is made when it is assumed that if a particular brain region is activated during the performance of a cognitive task, the subject must have engaged the cognitive process supported by that region during the task (referred to as a “reverse inference”). For example, after observing activation of the frontal lobes during a mental rotation task, Cohen et al. (1996) proposed that subjects engaged working memory processes to recall the identity of the rotated target. (They derived this assumption from other imaging studies showing activation of the frontal lobes during working memory tasks.) However, in this example, because some other cognitive process supported by the frontal lobes could have activated this region (D’Esposito et al. 1998), one cannot be sure that engagement of working memory is what led to the activation of the frontal lobes. Unfortunately, this potentially faulty logic is fairly common practice in fMRI studies. See (Poldrack 2006) for a further discussion of this issue.
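The weakness of reverse inference can be made concrete with Bayes' rule. The following is a minimal sketch, not an analysis taken from any of the cited studies: the function name and all probabilities are hypothetical and chosen only for illustration. It shows that the probability that a process was engaged, given that a region was active, depends heavily on how often the region is also active when the process is not engaged.

```python
def posterior_process_given_activation(p_act_given_proc, p_act_given_not_proc, prior_proc):
    """Return P(process engaged | region active) using Bayes' rule."""
    evidence = (p_act_given_proc * prior_proc
                + p_act_given_not_proc * (1.0 - prior_proc))
    return p_act_given_proc * prior_proc / evidence


# Hypothetical numbers for illustration only.
# A selective region: rarely active unless the process is engaged.
selective = posterior_process_given_activation(0.8, 0.1, 0.5)

# An unselective region: often active for other reasons as well.
unselective = posterior_process_given_activation(0.8, 0.6, 0.5)

print(f"selective region:   P(process | activation) = {selective:.2f}")
print(f"unselective region: P(process | activation) = {unselective:.2f}")
```

Under these assumed values, activation of the unselective region raises the probability of the process only slightly above the prior of 0.5, which is the situation described above for the frontal lobes.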

In summary, interpretation of the results of functional neuroimaging studies attempting to link brain and behavior rests on numerous assumptions. Familiarity with the types of inferences that can and cannot be drawn from these studies is helpful for assessing the validity of the findings reported by such studies.

Functional MRI as a Cognitive Neuroscience Tool

Functional MRI has become the predominant functional neuroimaging method for studying the neural basis of cognitive processes in humans. Compared to its predecessor, positron emission tomography (PET) scanning, fMRI offers many advantages. For example, MRI scanners are much more widely available, and imaging costs are lower since MRI does not require a cyclotron to produce radioisotopes. MRI is also a noninvasive procedure since there is no requirement for injection of a radioisotope into the bloodstream. Also, given the half-life of available radioisotopes, PET scanning cannot match the temporal resolution of fMRI, which can provide images of behavioral events occurring on the order of seconds rather than the summation of many behavioral events over tens of seconds.

The MRI scanner, compared to a behavioral testing room, is less than ideal for performing most cognitive neuroscience experiments. Experiments are performed in the awkward position of lying on one’s back, often requiring subjects to view stimuli through a mirror, in an acoustically noisy environment. Moreover, many individuals develop some degree of claustrophobia due to the small bore of the MRI scanner and find it difficult to remain completely motionless for the long periods that most experiments require (usually 60–90 min). These constraints of the MRI scanner make it especially difficult to scan children or certain patient populations (e.g., Parkinson’s disease patients). However, mock scanners have been built in many imaging centers, with motion devices, which acclimate children or patients to the scanner environment before they participate in an fMRI study.

All sensory systems have been investigated with fMRI including the visual, auditory, somatosensory, olfactory, and gustatory systems. Each system requires different technologies for successful presentation of relevant stimuli within an MRI environment. In brief, the most common means of presenting visual stimuli is via a liquid crystal display (LCD) projector system with the sophistication of the system depending on the quality of image resolution required for the experiment. Several options exist for auditory stimuli such as piezoelectric or electrostatic headphones. However, the biggest challenge is the acoustically noisy scanner environment. The pulsing of the fMRI gradient coils is the source of such noise making the study of auditory processes challenging (Mueller et al. 2011; Edmister et al. 1999; Belin et al. 1999). For example, during echo-planar imaging (EPI) within a 4 T magnet using a high-performance head gradient set, sound levels can reach 130 dB. As a reference point, Food and Drug Administration (FDA) safety regulations require no greater than an average of 105 dB for 1 h. With the placement of absorbing materials within the scanner and on the walls of the room, as well as a fiberglass bore liner surrounding the gradient set, we were able to reduce sound levels by about 25 dB. One of the biggest technical challenges within an MRI scanner has been the ability to present olfactory stimuli. However, sophisticated MR compatible olfactometers have been designed and utilized successfully (Lowen and Lukas 2006; Sobel et al. 1997). Such methods use a nasal-mask in which the change from odorant to no-odorant conditions occurs within a few milliseconds.

Acquiring ancillary electrophysiological data such as electromyographic recordings to measure muscle contraction or electrodermal responses to measure autonomic activity enhances many cognitive neuroscience experiments. Devices have been developed that are MR compatible for these types of measurements as well as other physiological measures such as heart rate, electrocardiography, oxygen saturation, and respiratory rate. The recording of eye movements is becoming commonplace in MRI scanners, predominantly with the use of infrared video cameras equipped with long-range optics (Reuter et al. 2010; Gitelman et al. 2000). Video images of the pupil-corneal reflection can be sampled at 60, 120, or 240 Hz, allowing for the accurate (< 1°) localization of gaze within 50 horizontal and 40 vertical degrees of visual angle. Although most behavioral tasks used in cognitive neuroscience experiments rely on collecting manual responses, the ability to reliably collect verbal responses without significant artifact introduced into the data has been demonstrated by several laboratories (Abraham et al. 2003; Palmer et al. 2001; Barch et al. 1999).

EEG recordings have also been successfully performed during MRI scanning (e.g., Scheeringa et al. 2009; Goldman et al. 2000). However, the recording of event-related potentials (ERP), a signal that is much smaller in amplitude than the ongoing EEG, can be more difficult in a magnetic field due to artifacts induced by gradient pulsing and head movement from cardiac pulsation. New monitoring devices and algorithms to remove artifacts are being developed, allowing for reliable measurements of ERPs during MRI scanning (Mantini et al. 2007; Otzenberger et al. 2007). In summary, most of the initial challenges of performing cognitive experiments within the MRI environment have been overcome, creating an environment that is comparable to standard psychophysical testing laboratories outside of a scanner. Although individual laboratories have achieved most of these advancements, MRI scanners originally designed for clinical use by manufacturers are now being designed with consideration of many of these research-related issues.

Temporal Resolution

Two types of temporal resolution need to be considered for cognitive neuroscience experiments. First, what is the briefest neural event that can be detected as an fMRI signal? Second, how close together in time can two neural events occur and still be resolved as separable fMRI signals?

The timescale on which neural changes occur is quite rapid. For example, neural activity in the lateral intraparietal area of monkeys increases within 100 ms of the visual presentation of a saccade target (Gnadt and Andersen 1988). In contrast, the fMRI signal gradually increases to its peak magnitude within 4–6 s after an experimentally induced brief (< 1 s) change in neural activity and then decays back to baseline after several more seconds (Aguirre et al. 1998; Bandettini et al. 1992; Boynton et al. 1996) (see Chap. 8). This slow time course of fMRI signal change in response to such a brief increase in neural activity is informally referred to as the blood oxygenation level-dependent (BOLD) fMRI hemodynamic response or simply, the hemodynamic response (Fig. 18.1). Thus, neural dynamics and neurally evoked hemodynamics, as measured with fMRI, are on quite different timescales.

Fig. 18.1

A typical hemodynamic response (i.e., fMRI signal change in response to a brief increase of neural activity) from the primary sensorimotor cortex. The fMRI signal peaked approximately 5 s after the onset of the motor response (at time zero)
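The shape of such a response is often approximated in analysis software by a "double-gamma" function: a gamma-shaped peak minus a smaller, later gamma-shaped undershoot. The sketch below is an illustration under that assumption. The function name and the shape parameters (6, 16, and an undershoot ratio of 1/6) are conventional illustrative choices and are not values given in this chapter.

```python
import math

import numpy as np


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1.0 / 6.0):
    """Illustrative double-gamma hemodynamic response (arbitrary units).

    t is time in seconds after a brief neural event; values before
    the event (t < 0) are returned as zero.
    """
    pos = np.clip(np.asarray(t, dtype=float), 0.0, None)
    peak = pos ** (peak_shape - 1) * np.exp(-pos) / math.gamma(peak_shape)
    under = pos ** (under_shape - 1) * np.exp(-pos) / math.gamma(under_shape)
    return peak - under_ratio * under


t = np.arange(0, 30, 0.1)          # seconds after the event
hrf = double_gamma_hrf(t)

peak_time = t[np.argmax(hrf)]      # time of the positive peak
trough_time = t[np.argmin(hrf)]    # time of the post-stimulus undershoot

print(f"peak at about {peak_time:.1f} s, undershoot minimum at about {trough_time:.1f} s")
```

With these assumed parameters the modeled response peaks within the 4–6 s window described above and then dips below baseline before recovering.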

The sluggishness of the hemodynamic response limits the temporal resolution of the fMRI signal to hundreds of milliseconds to seconds, as opposed to the millisecond temporal resolution of electrophysiological recordings of neural activity. However, it has been clearly demonstrated that brief changes in neural activity can be detected with reasonable statistical power using fMRI. For example, appreciable fMRI signal can be observed in sensorimotor cortex in association with single finger movements (Kim et al. 1997) and in visual cortex during very briefly presented (34 ms) visual stimuli (Savoy et al. 1995). In contrast, the temporal resolution of fMRI limits the detection of sequential changes in neural activity that occur rapidly with respect to the hemodynamic response. That is, resolving the changes in the fMRI signal associated with two neural events often requires that those events be separated by a relatively long period of time compared with the width of the hemodynamic response. This is because two neural events closely spaced in time will produce a hemodynamic response that reflects the accumulation from both neural events, making it difficult to estimate the contribution of each individual neural event. In general, evoked fMRI responses to discrete neural events separated by at least 4 s appear to be within the range of resolution (Zarahn et al. 1997a). However, provided that the stimuli are presented randomly, studies have shown significant differential functional responses between two events (e.g., flashing visual stimuli) spaced as closely as 500 ms apart (Burock et al. 1998; Clark et al. 1997; Dale and Buckner 1997). The effect of fixed and randomized intertrial intervals on the BOLD signal is illustrated in Fig. 18.2.

Fig. 18.2

Effect of fixed versus randomized intertrial intervals on the BOLD fMRI signal (Burock et al. 1998)
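The accumulation problem described above can be illustrated with a small simulation. This is a sketch under two assumptions that are not taken from the cited studies: responses to successive events add linearly, and each response follows an illustrative double-gamma shape. The helper names are hypothetical. The code counts how many distinct peaks appear in the summed response when two identical events are 1 s or 12 s apart.

```python
import math

import numpy as np


def double_gamma_hrf(t):
    """Illustrative hemodynamic response; zero before the event (t < 0)."""
    pos = np.clip(np.asarray(t, dtype=float), 0.0, None)
    peak = pos ** 5 * np.exp(-pos) / math.gamma(6)
    undershoot = pos ** 15 * np.exp(-pos) / math.gamma(16)
    return peak - undershoot / 6.0


def count_peaks(signal, min_height):
    """Count local maxima of a sampled signal that exceed min_height."""
    mid = signal[1:-1]
    is_peak = (mid > signal[:-2]) & (mid >= signal[2:]) & (mid > min_height)
    return int(is_peak.sum())


t = np.arange(0, 40, 0.1)  # seconds


def summed_response(separation_s):
    """Linear sum of two identical responses separated by separation_s."""
    return double_gamma_hrf(t) + double_gamma_hrf(t - separation_s)


peaks_by_separation = {}
for separation in (1.0, 12.0):
    y = summed_response(separation)
    # Ignore small ripples: only count peaks above 10% of the maximum.
    peaks_by_separation[separation] = count_peaks(y, 0.1 * y.max())

print(peaks_by_separation)
```

In this toy case the two events merge into a single peak at 1 s separation and appear as two peaks at 12 s. The simulation says nothing about randomized designs, where differences between event types can be estimated statistically even though the summed signal shows no separate peaks.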

The sluggish nature of the hemodynamic response is of particular concern when the order of individual trial events cannot be randomized, that is, for experiments in which switching the temporal order of stimuli would alter the meaning of the stimuli. In standard working memory tasks, the presentation of the information to be remembered and the later period when the subject recalls that information are individual trial events that must occur in sequence; recall of the information is not possible unless the information is first presented. Studies of cognitive control also often employ designs in which target stimuli are meaningless unless preceded by a cue stimulus. One approach that researchers have taken to measure the BOLD response evoked by each stimulus type is to separate the events sufficiently (> 4 s) so that the individual responses can be resolved (Rypma and D’Esposito 1999; Sakai and Passingham 2003). This approach requires lengthy trials, which may be undesirable for practical reasons (the experiment may run too long) or for reasons particular to the experiment (e.g., the experimenter wants to limit the amount of preparation time subjects have to respond to the second stimulus). Another approach involves mixing full trials, in which stimulus 1 is followed after a fixed interstimulus interval by stimulus 2, with partial trials, in which the trial terminates after the presentation of stimulus 1 (Ollinger et al. 2001a, 2001b; Ruge et al. 2009). With a sufficient proportion of partial trials (~ 20–25 %), the BOLD responses associated with each stimulus type can be identified, even when brief interstimulus intervals are used. Despite the inherent temporal spread of the hemodynamic response, researchers continue to develop methods that will improve our ability to resolve fMRI responses to events that occur close together in time.
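Why partial trials help can be seen from the regression design alone. The sketch below is a toy illustration, not the procedure of the cited studies: it assumes one scan per time unit, a fixed interstimulus interval of 2 scans, widely spaced trials, and a finite impulse response (FIR) model with 10 lags per stimulus. All names and values are hypothetical. With full trials only, many stimulus 2 regressors are exact copies of stimulus 1 regressors, so the two responses cannot be estimated separately; adding 25 % partial trials removes that redundancy.

```python
import numpy as np


def fir_design(onsets, n_scans, n_lags):
    """One indicator column per post-stimulus lag (FIR basis)."""
    X = np.zeros((n_scans, n_lags))
    for onset in onsets:
        for lag in range(n_lags):
            if onset + lag < n_scans:
                X[onset + lag, lag] = 1.0
    return X


n_scans, n_lags, isi = 400, 10, 2            # hypothetical values
trial_onsets = np.arange(0, n_scans, 20)     # stimulus 1 onsets, in scans


def design(partial_mask):
    """Stack FIR regressors for stimulus 1 and stimulus 2.

    Stimulus 2 is omitted on trials flagged as partial.
    """
    s1 = fir_design(trial_onsets, n_scans, n_lags)
    s2 = fir_design(trial_onsets[~partial_mask] + isi, n_scans, n_lags)
    return np.hstack([s1, s2])


no_partials = np.zeros(len(trial_onsets), dtype=bool)
some_partials = np.arange(len(trial_onsets)) % 4 == 0   # 25% partial trials

rank_full_only = np.linalg.matrix_rank(design(no_partials))
rank_mixed = np.linalg.matrix_rank(design(some_partials))

print(f"columns: {2 * n_lags}, rank with full trials only: {rank_full_only}, "
      f"rank with partial trials: {rank_mixed}")
```

A design matrix whose rank equals its number of columns allows every lag of both responses to be estimated; the full-trials-only design in this toy case falls short of that.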

Spatial Resolution

It is yet to be determined how precisely the measured BOLD fMRI signal, which arises from the vasculature, reflects adjacent neural activity. Thus, the ultimate spatial resolution of BOLD fMRI is unknown (see Chaps. 4, 8, and 26). Functional MRI studies in both monkeys and humans at high field (4–4.7 T) have demonstrated that BOLD signal can be obtained with high spatial resolution, approximately 0.75 × 0.75 mm² in-plane (Logothetis et al. 1999; Cheng et al. 2001). In monkeys, with novel approaches such as using a small, tissue-compatible, intraosteally implanted radiofrequency coil, ultra-high spatial resolution of 125 × 125 µm² has been obtained (Logothetis et al. 2002). Using this method, Logothetis and colleagues demonstrated cortical lamina-specific activation in a task that compared responses to moving stimuli with those elicited by flickering stimuli. This contrast elicited BOLD signal mostly in the granular layers of the striate cortex of the monkey, which are known to have a high concentration of directionally selective cells. Advances in such methods would allow for imaging of hundreds of neurons per voxel as opposed to hundreds of thousands of neurons per voxel, which is more typical for a human cognitive neuroscience fMRI experiment.

Virtually all fMRI studies model the large BOLD signal increase, which is due to a local low-deoxyhemoglobin state, in order to detect changes correlating with a behavioral task. However, optical imaging studies have demonstrated that preceding this large positive response there is an initial negative response reflecting a localized increase in oxygen consumption causing a high-deoxyhemoglobin state (Malonek and Grinvald 1996). This early hemodynamic response is called the “initial dip” and is thought to be more tightly coupled to the actual site of neural activity evoking the BOLD signal as compared to the later positive portion of the BOLD response (see Chap. 8). For example, Kim and colleagues, scanning cats in a high-field scanner, demonstrated that the early negative BOLD response (i.e., the initial dip) produced activation maps that were consistent with orientation columns within visual cortex. This finding is quite remarkable given that the average spacing between two adjacent orientation columns in cortex is approximately 1 mm. In contrast, the activation maps produced by the delayed positive BOLD response appeared more diffuse and cortical columnar organization could not be identified (Kim and Duong 2002). Thus, empirical evidence suggests that deriving activation maps by correlating behavioral responses with the initial dip may markedly improve spatial resolution. However, it is important to note that the initial dip of the BOLD signal has been inconsistently observed in humans across laboratories for reasons that are still unclear (see Uludağ 2010). Several groups, however, were able to detect columnar architecture (in this case ocular dominance columns) by modeling the positive BOLD response in humans scanned at 4 T (Cheng et al. 2001; Menon et al. 1997). These investigators attributed their success to optimized radiofrequency coils, limiting head motion, optimizing slice orientation, and the enhanced signal-to-noise ratio (SNR) provided by a high magnetic field.

Another unique method for improving spatial resolution has been called functional magnetic resonance-adaptation (fMR-A), which could provide a means for identifying and assessing the functional attributes of sharply defined neuronal populations within a given region of the brain (Weigelt et al. 2008; Krekelberg et al. 2006; Grill-Spector and Malach 2001). Even if the spatial resolution of fMRI evolves to the point of being able to resolve a population of a few hundred neurons within a voxel, it is still likely that this small population will contain neurons with very different functional properties that will be averaged together. The adaptation method is based on several basic principles. First, repeated presentation of the same type of stimulus (e.g., a picture of one object) causes neurons to adapt to that stimulus (i.e., neuronal firing is reduced). Second, if these neurons are then exposed to a different type of stimulus (e.g., a picture of another object) or a change in some property of the stimulus (e.g., the same object in a different orientation), recovery from adaptation can be assessed (i.e., whether or not the BOLD signal returns to its original state). If the signal remains adapted, the neurons are presumed to be invariant to the attribute that was changed; if the signal recovers from the adapted state, the neurons are presumed to be sensitive to that attribute. For example, Grill-Spector and colleagues demonstrated that an area of lateral occipital cortex thought to be important for object recognition was less sensitive to changes in object size and position as compared to changes in illumination and viewpoint (Grill-Spector et al. 1999). Thus, with this method, it is possible to investigate the functional properties of neuronal populations with a level of spatial resolution that is beyond that obtained from conventional fMRI data analysis methods.

Considering all the neuroscientific methods available today for studying human brain–behavior relationships, fMRI provides an excellent balance of temporal and spatial resolution. Improvements on both fronts will clearly add to the increasing popularity of this method.

Issues in Functional MRI Experimental Design

Numerous options exist for designing experiments using fMRI. The prototypical fMRI experimental design consists of two behavioral tasks presented in blocks of trials alternating over the course of a scanning session, and the fMRI signal between the two tasks is compared. This is known as a block design. For example, a given block might present a series of faces to be viewed passively, which evokes a particular cognitive process, such as face perception. The “experimental” block alternates with a “control” block, which is designed to evoke the same cognitive processes present in the experimental block except for the cognitive process of interest. In this experiment, the control block may comprise a series of objects. In this way, the stimuli used in experimental and control tasks have similar visual attributes but differ in the attribute of interest (i.e., faces). The inferential framework of “cognitive subtraction” (Posner et al. 1988) attributes differences in neural activity between the two tasks to the specific cognitive process (i.e., face perception). Cognitive subtraction was originally conceived by Donders in the late-1800s for studying the chronometric substrates of cognitive processes (see Sternberg 1969) and was a major innovation in imaging (Posner et al. 1988; Petersen et al. 1988).

The assumptions required for cognitive subtraction may not always hold and could produce erroneous interpretation of functional neuroimaging data (Zarahn et al. 1997a). Cognitive subtraction relies on two assumptions: “pure insertion” and linearity. Pure insertion implies that a cognitive process can be added to a preexisting set of cognitive processes without affecting them. This assumption is difficult to prove because one needs an independent measure of the preexisting processes in the absence and presence of the new process (Sternberg 1969). If pure insertion fails as an assumption, a difference in the neuroimaging signal between the two tasks might be observed, not because a specific cognitive process was engaged in one task and not the other, but because the added cognitive process and the preexisting cognitive processes interact.

An example of this point is illustrated in working memory studies using delayed-response tasks (Fuster 1997). These tasks (for an example, see Jonides et al. 1993) typically present information that the subject must remember (engaging an encoding process), followed by a delay period during which the subject must hold the information in memory over a short period of time (engaging a memory process), followed by a probe that requires the subject to make a decision based on the stored information (engaging a retrieval process). The brain regions engaged by evoking the memory process theoretically are revealed by subtracting the BOLD signal measured by fMRI during a block of trials that the subject performs that do not have a delay period (only engaging the encoding and retrieval processes) from a block of trials with a delay period (engaging the encoding, memory, and retrieval processes). In this example, if the addition or “insertion” of a delay period between the encoding and retrieval processes affects these other behavioral processes in the task, the result is failure to meet the assumptions of cognitive subtraction. That is, these “non-memory” processes may differ in delay trials and no-delay trials, resulting in a failure to cancel each other out in the two types of trials that are being compared.

Empirical evidence of such failure exists (Zarahn et al. 1999). For example, Fig. 18.3 demonstrates BOLD signal derived from the prefrontal cortex from a subject performing a delayed response task similar to the tasks described above. The left side of the figure illustrates BOLD signal consistent with delay period activity whereas the right side of the figure illustrates BOLD signal from another region of prefrontal cortex that did not display sustained activity during the delay yet showed greater activity in the delay trials as compared to the trials without a delay. In any fMRI study using a block design that compares delay versus no-delay trials with subtraction, such a region would be detected and likely assumed to be a “memory” region. Thus, this result provides empirical grounds for adopting a healthy doubt regarding the inferences drawn from imaging studies that rely exclusively on cognitive subtraction.

Fig. 18.3

Data derived from the performance of a normal subject on a spatial delayed-response task (Zarahn et al. 1999). This task comprised both delay trials (circles) as well as trials without a delay period (no-delay trials; diamonds). (a) Trial-averaged fMRI signal from prefrontal cortex that displayed delay-correlated activity. The gray bar along the x-axis denotes the 12 s delay period during delay trials. The delay trials display a level of fMRI signal greater than baseline throughout the period of time corresponding to the retention delay (taking into account the delay and dispersion of the fMRI signal). The peaks seen in the signal correspond to the encoding and retrieval periods. (b) Trial-averaged fMRI signal from a region in prefrontal cortex that did not display the characteristics of delay-correlated activity. This region displays a significant functional change associated with the no-delay trials, and a significant functional change associated with the encoding and retrieval periods of the delay trials, but not one associated with the retention delay of delay trials. BOLD, blood oxygenation level-dependent
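The failure of subtraction for a region like the second one can be reproduced in a noise-free toy simulation. This is a sketch, not the analysis of Zarahn et al. (1999). It assumes a hypothetical region with no delay-period activity whose retrieval response is simply twice as large after a delay, a one-second sampling interval, linear summation, and an illustrative double-gamma hemodynamic response. All amplitudes and names are invented for illustration.

```python
import math

import numpy as np

# Illustrative hemodynamic response sampled once per second (assumed shape).
lags = np.arange(0, 30, dtype=float)
hrf = (lags ** 5 * np.exp(-lags) / math.gamma(6)
       - lags ** 15 * np.exp(-lags) / (6 * math.gamma(16)))

n = 45  # samples in one trial window


def bold(neural):
    """Predicted signal: neural time course convolved with the response."""
    return np.convolve(neural, hrf)[:n]


def impulse(t):
    """Brief neural event at sample t."""
    x = np.zeros(n)
    x[t] = 1.0
    return x


delay_box = np.zeros(n)
delay_box[1:13] = 1.0  # 12 s retention delay

# Hypothetical region: encoding = 1, delay = 0, retrieval = 2 after a delay
# but retrieval = 1 when there is no delay (an interaction, not memory).
delay_trial = bold(1.0 * impulse(0) + 0.0 * delay_box + 2.0 * impulse(13))
no_delay_trial = bold(1.0 * impulse(0) + 1.0 * impulse(1))

# Block-style subtraction: mean signal in delay minus no-delay trials.
subtraction = delay_trial.mean() - no_delay_trial.mean()

# Modeling encoding, delay, and retrieval separately within delay trials.
X = np.column_stack([bold(impulse(0)), bold(delay_box), bold(impulse(13))])
betas, *_ = np.linalg.lstsq(X, delay_trial, rcond=None)

print(f"subtraction difference: {subtraction:.4f}")
print(f"encoding, delay, retrieval estimates: {np.round(betas, 3)}")
```

The subtraction yields a positive difference even though the simulated delay-period activity is zero, whereas modeling the trial components separately attributes the effect to the retrieval period.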

The transform between the neural signal and the hemodynamic response (measured by fMRI) must be linear for the cognitive subtractive method to yield valid results. In other words, it is assumed that the BOLD signal being measured is approximately proportional to the local neural activity that evokes it. Surprisingly, although thousands of empirical studies using fMRI to study brain–behavior relationships have been published, only a handful exist that have explored the neurophysiological basis of the BOLD signal (for a review see Attwell and Iadecola 2002; Heeger and Ress 2002). In several studies, linearity did not strictly hold for the BOLD fMRI system but the linear transform model was reasonably consistent with the data. For example, Boynton and colleagues tested whether BOLD signal in response to long duration stimuli can be predicted by summing the responses to shorter duration stimuli (Boynton et al. 1996). Using pulses of flickering checkerboard patterns and measuring within human primary visual cortex, these investigators found that the BOLD signal response to various durations of stimulus presentation (6, 12, or 24 s) could be predicted from the responses they obtained from shorter stimulus presentations. For example, the BOLD signal response to a 6 s pulse could be predicted from the summation of the BOLD signal response to the 3 s pulse with a copy of the same response delayed by 3 s. However, temporal summation did not always hold, and there are clearly nonlinear effects in the transform of neural activity to a hemodynamic response that must be considered (Friston et al. 1998; Glover 1999; Miller et al. 2001; Vazquez and Noll 1998). If these nonlinearities lead to saturation of the BOLD effect at specific stimulus intensities, erroneous interpretation of particular results of fMRI experiments may occur.
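The temporal summation test described above can be illustrated with a small simulation. The sketch below assumes a perfectly linear system with a toy gamma-shaped impulse response sampled once per second; the function names and the shape and scale parameters are hypothetical and are not taken from Boynton et al. (1996). It checks that the response to a 6 s pulse equals the response to a 3 s pulse plus a copy of that response delayed by 3 s.

```python
import math

def hrf(t, shape=6.0, scale=1.0):
    """Toy gamma-shaped impulse response (hypothetical parameters)."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)

def convolve(stimulus, kernel):
    """Discrete causal convolution, truncated to the length of the stimulus."""
    n = len(stimulus)
    out = [0.0] * n
    for i in range(n):
        for j in range(min(i + 1, len(kernel))):
            out[i] += stimulus[i - j] * kernel[j]
    return out

def boxcar(duration, n):
    """Stimulus that is on (1.0) for `duration` samples, then off."""
    return [1.0 if t < duration else 0.0 for t in range(n)]

def shift(signal, lag):
    """Delay a signal by `lag` samples, padding with zeros at the start."""
    return [0.0] * lag + signal[:len(signal) - lag]

N = 40  # number of 1 s samples (assumed sampling rate)
kernel = [hrf(t) for t in range(N)]

resp_3s = convolve(boxcar(3, N), kernel)   # simulated response to a 3 s pulse
resp_6s = convolve(boxcar(6, N), kernel)   # simulated response to a 6 s pulse

# Prediction under linearity: 3 s response plus a copy delayed by 3 s
predicted_6s = [a + b for a, b in zip(resp_3s, shift(resp_3s, 3))]
max_error = max(abs(p - o) for p, o in zip(predicted_6s, resp_6s))
print(max_error)
```

In this idealized linear system the prediction error is zero up to floating-point rounding. With measured data, a systematic mismatch between the predicted and observed responses would be one indication of the nonlinearities discussed above.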

Another class of experimental designs, called event-related fMRI, attempts to detect changes associated with individual trials, as opposed to the larger unit of time comprising a block of trials (D’Esposito et al. 1999; Rosen et al. 1998). Each individual trial may be composed of one behavioral “event,” such as the presentation of a single stimulus (e.g., a face or an object to be perceived), or of several behavioral events, as in the delayed-response task described above (e.g., an item to be remembered, a delay period, and a motor response). For example, with an event-related design, activity within the PFC has consistently been observed during the delay period (Zarahn et al. 1999), supporting the role of the PFC in temporarily maintaining information. Event-related designs offer numerous advantages. For example, they allow for stimulus or trial randomization, avoiding the behavioral confounds of blocked trials. They also permit the separate analysis of functional responses that are identified only in retrospect (e.g., trials on which the subject made a correct or an incorrect response). Of course, an experiment does not have to be limited to either a block or an event-related design; a mixed design (both event-related and block), in which particular trial types are randomized within a block, is perfectly feasible. In this type of design, both item-related processes (e.g., transient responses to stimuli) and state-related processes (processes sustained throughout a block of trials or a task; Donaldson et al. 2001; Mitchell et al. 2000) can be measured.

Overall, much flexibility exists in the type of experimental design that can be utilized in fMRI experiments and continued innovation in this area will greatly expand the types of neuroscientific questions that can be addressed.

Issues in Interpretation of fMRI Data

Statistics

Many statistical techniques are used for analyzing fMRI data, but no single method has emerged as the ideal or “gold standard.” The analysis of any fMRI experiment designed to contradict the null hypothesis (i.e., that there is no difference between experimental conditions) requires inferential statistics (see Chap. 12). If the difference between two experimental conditions is too large to be reasonably due to chance, then the null hypothesis is rejected in favor of the alternative hypothesis, which typically is the experimenter’s hypothesis (e.g., the fusiform gyrus is activated to a greater extent by viewing faces than objects). Unfortunately, since errors can occur in any statistical test, experimenters will never know when an error has been committed and can only try to minimize the chance of one (Keppel and Zedeck 1989). Knowledge of several basic statistical issues provides a solid foundation for the correct interpretation of the data derived from fMRI studies.

Two types of statistical errors can occur. A type I error is committed when the null hypothesis is rejected although it is true, that is, a difference between experimental conditions is found but a difference does not truly exist. This type of error is also called a false-positive error. In an fMRI study, a false-positive error would be finding a brain region activated during a cognitive task when actually it is not. A type II error is committed when the null hypothesis is accepted although it is false, that is, no difference between experimental conditions is found when a difference does exist. This type of error is also called a false-negative error. A false-negative error in an fMRI study would be failing to find a brain region activated during the performance of a cognitive task when actually it is. The concept of a type II error is closely related to the idea of statistical power. If the false-negative rate for a given study design is 20 %, for instance, then the “power” of that design to detect an activation is 100 % − 20 %, or 80 %.

In cognitive neuroscience studies, much emphasis has been placed on avoiding type I errors. The negative effects of incorrectly identifying a brain region as task-active include the expenditures of time, money, and effort spent in replicating and/or expanding upon a false-positive result. Type II error, on the other hand, is seen as less damning; failure to detect brain activity in a research study has fewer implications for future research, provided that one is careful to interpret so-called null results correctly. For example, cognitive neuroscience studies tend to employ a small number of subjects (15 would not be atypical), owing to factors such as the expense of scanning and the difficulty of finding research participants, and therefore frequently lack power to detect significant brain activations. One must consequently be careful to avoid interpreting a lack of activation in one part of the brain as true inactivity during the task.

In fMRI experiments, as in all experiments, a tolerable probability of type I error, typically less than 5 %, is chosen for adequate control of specificity, that is, control of false-positive rates. Two features of fMRI data can cause unacceptable false-positive rates, even with traditional parametric statistical tests. First, there is the problem of multiple comparisons. For the typical resolution of images acquired during fMRI scans, the full extent of the human brain could comprise as many as 15,000 voxels. Thus, with any given statistical comparison of two experimental conditions, there are actually 15,000 statistical comparisons being performed. With such a large number of statistical tests, the probability of finding a false-positive activation, that is, committing a type I error, somewhere in the brain increases. Several methods exist to deal with this problem. One method, the Bonferroni correction, treats each statistical test as independent and sets the threshold for each individual test by dividing the chosen overall probability of type I error (p = 0.05) by the number of statistical tests performed. Another method is based on Gaussian random field theory (Worsley and Friston 1995) and calculates the probability of type I error when imaging data are spatially smoothed. Many other methods for determining thresholds of statistical maps have been proposed and are in use (Everitt and Bullmore 1999; Genovese et al. 2002; Nichols and Holmes 2002), but unfortunately no single method has been universally accepted. Nevertheless, all fMRI studies must apply some type of correction for multiple comparisons to control the false-positive rate.
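The arithmetic behind the multiple comparisons problem is easy to work through. The sketch below uses the figures given above (15,000 voxels and a 5 % threshold) and assumes, as the Bonferroni correction does, that the voxelwise tests are independent; the function and variable names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall type I error rate near alpha."""
    return alpha / n_tests

def familywise_error_rate(per_test_alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests

n_voxels = 15_000
alpha = 0.05

# Without correction: nearly certain to see at least one false positive
uncorrected_fwer = familywise_error_rate(alpha, n_voxels)
expected_false_positives = alpha * n_voxels  # voxels expected to pass by chance

# With Bonferroni correction: each voxel is tested at alpha / n_voxels
per_voxel_alpha = bonferroni_threshold(alpha, n_voxels)
corrected_fwer = familywise_error_rate(per_voxel_alpha, n_voxels)

print(per_voxel_alpha, uncorrected_fwer, corrected_fwer, expected_false_positives)
```

Because neighboring voxels in real, spatially smoothed data are not independent, the Bonferroni threshold is generally more conservative than necessary, which is part of the motivation for the alternative methods cited above.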

The second feature that might increase the false-positive rate is the “noise” in fMRI data. Data from BOLD fMRI are temporally autocorrelated with more noise at some frequencies than at others. The shape of this noise distribution is characterized by a 1/frequency function with increasing noise at lower frequencies (Zarahn et al. 1997b). Traditional parametric and nonparametric statistical tests assume that the noise is not temporally autocorrelated, that is, each observation is independent. Therefore, any statistical test used in fMRI studies must account for the noise structure of fMRI data. If not, the false-positive rates will inflate (Zarahn et al. 1997b; Aguirre et al. 1997).
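The effect of ignoring temporal autocorrelation can be demonstrated with simulated null data. The sketch below is only an illustration under stated assumptions: it substitutes a simple first-order autoregressive process for the 1/frequency noise described above, uses a one-sample t-test on the mean as a stand-in for a full fMRI analysis, and picks hypothetical values for the number of scans, the autocorrelation coefficient, and the number of simulations.

```python
import math
import random

def ar1_series(n, phi, rng):
    """Noise in which each sample carries over a fraction phi of the previous one."""
    x = rng.gauss(0.0, 1.0)
    series = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series

def t_statistic(series):
    """One-sample t statistic that (wrongly, if phi > 0) assumes independent samples."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / (n - 1)
    return mean / math.sqrt(var / n)

def false_positive_rate(phi, n_scans=100, n_sims=2000, t_crit=1.984, seed=0):
    """Fraction of pure-noise series declared significant at two-sided p < 0.05.

    t_crit is the two-sided 5 % critical value for 99 degrees of freedom.
    """
    rng = random.Random(seed)
    hits = sum(
        abs(t_statistic(ar1_series(n_scans, phi, rng))) > t_crit
        for _ in range(n_sims)
    )
    return hits / n_sims

fp_white = false_positive_rate(phi=0.0)           # independent noise
fp_autocorrelated = false_positive_rate(phi=0.5)  # temporally autocorrelated noise
print(fp_white, fp_autocorrelated)
```

With independent noise the false-positive rate should sit near the nominal 5 %, whereas with autocorrelated noise the same test rejects the null hypothesis several times more often, even though no signal is present in either case.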

Type II error is rarely considered in functional neuroimaging studies. When a brain map from an fMRI experiment is presented, several areas of activation are typically attributed to some experimental manipulation. The focus of most fMRI studies is on brain activation whereas it is often implicitly assumed that all of the other areas (typically most of the brain) were not activated during the experiment. Power as a statistical concept refers to the probability of correctly rejecting the null hypothesis (Keppel and Zedeck 1989). As the power of an fMRI study to detect changes in brain activity increases, the false-negative rate decreases. Unfortunately, power calculations for particular fMRI experiments are rarely performed, although this methodology is evolving (D’Esposito et al. 2000; Zarahn and Slifstein 2001; Van Horn et al. 1998). Reports that specific brain areas were not active during an experimental manipulation should provide an estimate of the power required for detection of a change in the region. All experiments should be designed to maximize power. Relatively simple strategies can increase power in an fMRI experiment in certain circumstances, such as increasing the amount of imaging data collected or increasing the number of subjects studied. It is also important to note that task designs can affect sensitivity (Aguirre and D’Esposito 1999). For example, since BOLD fMRI data are temporally autocorrelated, experiments with fundamental frequencies in the lower range (e.g., a boxcar design with 60 s epochs) will have reduced sensitivity, due to the presence of greater noise at these lower frequencies. Finally, in a study that simultaneously measured neural signal via intracortical recording and BOLD signal in a monkey, it was observed that the SNR of the neural signal was on average at least one order of magnitude higher than that of the BOLD signal. 
The investigators of this study concluded that “the statistical and thresholding methods applied to the hemodynamic responses probably underestimate a great deal of actual neural activity related to a stimulus or task” (Logothetis et al. 2001). Thus, the magnitude of type II error in BOLD fMRI may currently be underestimated and warrants further consideration in the interpretation of almost any cognitive neuroscience experiment.
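A rough sense of how power depends on the number of subjects can be obtained from a normal approximation. The sketch below assumes a one-sided, one-sample z-test on the signal change in a single region and a hypothetical standardized effect size of 0.5; neither assumption comes from the studies cited above, and dedicated fMRI power methods are considerably more involved.

```python
from statistics import NormalDist

def power_one_sample_z(effect_size, n_subjects, alpha=0.05):
    """Approximate power of a one-sided, one-sample z-test.

    effect_size is the true mean divided by the between-subject
    standard deviation (a hypothetical value in this sketch).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha)
    return z.cdf(effect_size * n_subjects ** 0.5 - z_crit)

def subjects_needed(effect_size, target_power=0.80, alpha=0.05):
    """Smallest number of subjects whose approximate power reaches the target."""
    n = 2
    while power_one_sample_z(effect_size, n, alpha) < target_power:
        n += 1
    return n

power_15 = power_one_sample_z(0.5, 15)   # power with 15 subjects
false_negative_rate_15 = 1.0 - power_15  # corresponding type II error rate
n_for_80 = subjects_needed(0.5)          # subjects needed for 80 % power
print(round(power_15, 2), n_for_80)
```

Under these assumptions, 15 subjects give a power of about 0.61, and 25 subjects are needed to reach 80 %. The numbers change with the assumed effect size and with any correction for multiple comparisons, which lowers the per-test alpha and therefore the power.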

Altered Hemodynamic Response

When comparing changes in fMRI BOLD signal levels within the brain of an individual subject across different cognitive tasks and making conclusions regarding changes in neural activity and the pattern of activity, numerous assumptions are made regarding the steps comprising neurovascular coupling (stimulus → neural activity → hemodynamic response → BOLD signal) and the regional variability of the metabolic and vascular parameters influencing the BOLD signal. It should be obvious that fMRI studies of cognition of individuals with local vascular compromise or diffuse vascular disease (e.g., patients with strokes or normal elderly) are potentially problematic. For example, many fMRI studies have sought to identify age-related changes in the neural substrates of cognitive processes. These studies that directly compare changes in fMRI BOLD signal intensity across age groups rely upon the assumption of age-equivalent coupling of neural activity to BOLD signal. However, there is empirical evidence that suggests that this general assumption may not hold true. Extensive research on the aging neurovascular system has revealed that it undergoes significant changes in multiple domains in a continuum throughout the human lifespan, probably as early as the fourth decade (for review see Farkas and Luiten 2001). These changes affect the vascular ultrastructure (Fang 1976), the resting cerebral blood flow (Bentourkia et al. 2000; Schultz et al. 1999), the vascular responsiveness of the vessels (Yamamoto et al. 1980), and the cerebral metabolic rate of oxygen consumption (Yamaguchi et al. 1986; Takada et al. 1992). Aging is also frequently associated with comorbidities such as diabetes, hypertension, and hyperlipidemia, all of which may affect the fMRI BOLD signal by influencing cerebral blood flow and neurovascular coupling (Claus et al. 1998). 
Any one of these age-related differences in the vascular system could conceivably produce age-related differences in BOLD fMRI signal responsiveness greatly affecting the interpretation of results from such studies.

Our laboratory compared the hemodynamic response function (HRF) characteristics in the sensorimotor cortex of young and older subjects in response to a simple motor reaction-time task (D’Esposito et al. 1999). The provisional assumption was made that there was identical neural activity between the two populations based on physiological findings of equivalent movement-related electrical potentials in subjects under similar conditions (Cunnington et al. 1995). Thus, we presumed that any changes that we observed in BOLD fMRI signal between young and older individuals in motor cortex would be due to vascular and not neural activity changes in normal aging. Several important similarities and differences were observed between age groups. Although there was no significant difference in the shape of the hemodynamic response curve or in the peak amplitude of the signal, we found a significantly decreased SNR in the fMRI BOLD signal in older individuals as compared to young individuals. This was attributed to a greater level of noise in the older individuals. We also observed a decrease in the spatial extent of the BOLD signal (i.e., the median number of suprathreshold voxels) in sensorimotor cortex in older individuals compared to younger individuals. Similar results have been reported by two other laboratories (Buckner et al. 2000; Huettel et al. 2001). These findings suggest that there is some property of the coupling between neural activity and fMRI BOLD signal that changes with age.

In summary, comparing BOLD signal in two different groups of individuals that may differ in their vascular system should be done with caution (see Chap. 21). For example, in one scenario, a comparison of activation of young and elderly individuals during a cognitive task may show less activation by the elderly (as compared to young subjects) in some brain regions but greater activation in other regions (e.g., Rypma et al. 2001). In this scenario, it is unlikely that regional variations in the hemodynamic coupling of neural activity to fMRI signal would account for such age-related differences in patterns of activation. In another scenario, a comparison of young and elderly subjects may show less activation by the elderly (as compared to young subjects) in some brain regions, but no evidence of greater activation in any other region. In this case, it is possible that the observed age-related differences are not due to differences in intensity of neural activity, but rather to other nonneuronal contributions to the imaging signal, that is, neurovascular coupling.

Consequently, BOLD contrast methods yield signal changes that result from a complex mix of vascular effects and provide only relative, rather than absolute, measures. One approach to accounting for the influence of purely vascular effects is to directly measure regional and individual variability in vascular reactivity via a breath-holding task, which increases carbon dioxide concentration in the blood and leads to vascular dilatation (Wolf and Detre 2007). The task-related BOLD signal in each subject can then be corrected for particular region- and subject-specific vascular effects. An alternative functional neuroimaging approach, based on more direct measurements of cerebral blood flow to active brain areas, is known as arterial spin labeling (ASL) (see Chap. 11). In the various ASL techniques, the MRI scanner selectively magnetizes the flowing blood with a particular range of locations and/or velocities and then waits for the appearance of the magnetic “tag” in the downstream vessels. It thus becomes possible to obtain absolute measures of cerebral perfusion (Wolf and Detre 2007), thereby opening up the possibility of more quantitatively distinguishing between the differential influence of a disease on blood flow and its effect on brain activity (Brown et al. 2007). Additionally, relative to BOLD contrast, these absolute measurements appear to be more stable over long experiments (Aguirre et al. 2002), to show less between-subject and between-session variability (Liu and Brown 2007), and to produce decreased susceptibility artifact in areas such as the medial temporal lobe (Fernandez-Seara et al. 2007). The current major limitation is temporal resolution: one must both wait for the generation of sufficient magnetic label and also acquire two scans, a reference scan and a post-labeling scan, to produce a single data point. However, efforts are underway to improve this resolution. 
Research into so-called turbo ASL, for example, is attempting to reduce the time required to apply the magnetic tag and to optimize image acquisition with respect to the arrival time of the tag in downstream areas (Lee et al. 2007). A final potential disadvantage somewhat related to the temporal issues is the lower SNR of ASL relative to BOLD, but this decline may be compensated by the observation that ASL methods appear to be less variable across subjects (Brown et al. 2007).
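The temporal cost of ASL described above follows directly from the way each data point is formed. The sketch below assumes scans acquired in alternating reference (control) and post-labeling (label) order and takes the perfusion-weighted signal as control minus label; the function names and signal values are hypothetical.

```python
def perfusion_series(scans):
    """Pair scans as (control, label) and return control minus label per pair."""
    if len(scans) % 2:
        raise ValueError("expected an even number of scans (control/label pairs)")
    return [scans[i] - scans[i + 1] for i in range(0, len(scans), 2)]

def effective_sampling_interval(repetition_time_s):
    """Two acquisitions are needed per perfusion data point."""
    return 2 * repetition_time_s

# Hypothetical signal from one voxel, alternating control and label scans
voxel_signal = [100.0, 99.0, 101.0, 99.5]

print(perfusion_series(voxel_signal))
print(effective_sampling_interval(2.0))
```

Four acquisitions yield only two perfusion estimates, so the effective sampling interval is twice the repetition time of the individual scans.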

Types of Hypotheses Tested Using fMRI

Functional neuroimaging experiments test hypotheses regarding the anatomical specificity for cognitive processes (functional specialization) or direct or indirect interactions among brain regions (functional integration). The experimental design and statistical analyses chosen will determine the types of questions that can be addressed. Ultimately, the most powerful approach for the testing of theories on brain–behavior relationships is the analysis of converging data from multiple methods.

Functional Specialization

The major focus of fMRI studies of cognition is testing theories on functional specialization. The concept of functional specialization is based on the premise that functional modules exist within the brain, that is, areas of the cerebral cortex are specialized for a specific cognitive process. For example, facial recognition is a critical primary function likely served by a functional module. Prosopagnosia is the selective inability to recognize faces. Patients with prosopagnosia can, however, recognize familiar persons, such as relatives, by other means, such as voice, dress, or body shape. Other types of visual recognition, such as identifying common objects, are normal. Prosopagnosia arises from lesions of the inferomedial temporo-occipital lobe, which are usually due to a stroke within the posterior cerebral artery circulation. Lesion studies have not precisely localized the area crucial for facial perception, but they do provide strong evidence that a brain area is specialized for processing faces. Functional imaging studies have provided anatomical specificity for such a module. For example, Kanwisher et al. (1997) used fMRI to test a group of healthy individuals and found that the fusiform gyrus was significantly more active when the subjects viewed faces than when they viewed assorted common objects. The specificity of a “fusiform face area” was further demonstrated by the finding that this area also responded significantly more strongly to passive viewing of faces than to scrambled two-tone faces, front-view photographs of houses, and photographs of human hands. These elegant experiments allowed the investigators to reject alternative functions of the face area, such as visual attention, subordinate-level classification, or general processing of any animate or human forms, demonstrating that this region responds selectively to faces.

Of course, the existence of brain areas specialized for certain functions does not exclude the strong possibility that those areas are part of larger networks. Recent neuroimaging work has focused on pattern classification methods (see Chap. 23)—that is, on techniques to explore whether a distributed spatial pattern of brain activity corresponds to object (or more abstract) representations (Pereira et al. 2009; Norman et al. 2008). This area of research, still in its infancy, draws on results from physics, computer science, and statistics, among other disciplines, to search for more broadly distributed structure in neuroimaging data. As such, the techniques themselves differ. For example, to distinguish between voxel activity patterns across experimental conditions, various reports have used correlations between the set of activations in visual responses to faces and other objects (Haxby et al. 2001); neural network classifiers to identify particular patterns correlated with particular memories (Polyn et al. 2005); and variants of a matrix algebra transformation known as singular value decomposition to look for distributed spatial correlates of memory storage and search (Zarahn et al. 2006). A large number of other techniques—too large to be reviewed here—are also being tested. As such research continues, this type of pattern classification will need to be validated via comparison with behavioral responses, in order to ensure that these patterns are not epiphenomenal (Williams et al. 2007).
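The simplest of these approaches, classification by correlation, can be sketched in a few lines. The example below is an illustration only, in the spirit of the correlation-based analysis cited above rather than a reproduction of it: the voxel values are hypothetical, one half of the data is assumed to supply a template pattern for each category, and a held-out pattern is assigned to the category whose template it correlates with most strongly.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of voxel values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def classify(test_pattern, templates):
    """Return the label whose template correlates best with the test pattern."""
    return max(templates, key=lambda label: pearson(test_pattern, templates[label]))

# Hypothetical mean voxel patterns (arbitrary units) from one half of the data
templates = {
    "faces":  [2.1, 0.3, 1.8, 0.2, 1.5],
    "houses": [0.4, 1.9, 0.3, 2.2, 0.6],
}

# Hypothetical pattern from the held-out half, recorded while viewing a face
held_out_pattern = [1.9, 0.5, 1.6, 0.4, 1.7]

predicted = classify(held_out_pattern, templates)
print(predicted)
```

Here the information lies in the relative pattern across voxels rather than in the mean level of activity, which is why such methods can detect structure that a voxel-by-voxel comparison would miss.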

Functional Integration

Functional neuroimaging experiments can also test hypotheses about interactions between brain regions by focusing on covariances of activation levels between regions (Buchel et al. 1999; McIntosh et al. 1996). These covariances reflect “functional connectivity,” a concept that was originally developed in reference to temporal interactions among individual neurons (Gerstein et al. 1978).

In addition to providing information about the specialization of various brain regions, functional neuroimaging can also address the interactions between brain regions that underlie cognitive processing. Developing and understanding the techniques that permit these types of analyses is a very active area of current research (Penny et al. 2004). However, most, if not all, of the techniques used to test for regional interactions are ultimately based on the covariance of activation levels in different brain regions across time, in other words, on the way in which activity levels in different areas of the brain rise or fall in relation to each other. Such statistical techniques are commonly known as “multivariate,” both because they rely on interactions between two or more brain areas and to distinguish them from the “univariate” methods applied in most tests of functional specialization.

Multivariate techniques can also be further subdivided into two types, determined by whether the method is designed to assess connectivity in a model-free (“functional connectivity”) or model-based (“effective connectivity”) fashion (see Chaps. 10 and 11). The former refers simply to methods that measure the temporal covariance in the activity between brain areas without a priori notions about which brain areas are relevant or how they should interact. Examples of model-free techniques would include correlation and its frequency-based analogue, coherence, which can be applied irrespective of hypotheses about the neural events that produced them. On the other hand, model-based, or effective connectivity, approaches begin with hypotheses about the interactions between different brain regions and attempt to support/refute them by evaluating the presence/absence of specific activity covariance patterns. Examples of these techniques would include structural equation modeling and dynamic causal modeling, both of which start by postulating the existence of influences (potentially complex, potentially time-varying) between specific brain regions. Both types of statistical techniques have value, of course; their use is determined by the problem at hand. Model-free approaches are more general and more easily deployed in exploratory analyses. However, they are not as powerful as model-based methods, which address specific hypotheses about how regions interact—but which fail if the model is mis-specified. Model-free methods, for example, may be more useful when attempting to determine which networks of brain areas might be involved in a task, whereas model-based methods may be most appropriate when the nodes of the network are known, and specific notions about how they interact need to be tested.
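A model-free analysis of the kind described above can be as simple as correlating the time series of every pair of regions. The sketch below assumes one averaged signal value per region per scan; the region labels and values are hypothetical, and no prior hypothesis about how the regions interact enters the computation.

```python
import math

def correlation(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity_matrix(timeseries):
    """Correlation between every pair of regions, keyed by (region_a, region_b)."""
    names = list(timeseries)
    return {
        (a, b): correlation(timeseries[a], timeseries[b])
        for a in names
        for b in names
    }

# Hypothetical region-of-interest time series (arbitrary units), one value per scan
roi = {
    "FFA": [0.1, 0.9, 1.2, 0.4, -0.3, -0.8, 0.2, 1.0],
    "PFC": [0.0, 0.7, 1.0, 0.5, -0.2, -0.9, 0.1, 0.8],
    "V1":  [0.5, -0.4, 0.3, -0.6, 0.4, -0.2, 0.6, -0.5],
}

fc = connectivity_matrix(roi)
for pair in [("FFA", "PFC"), ("FFA", "V1")]:
    print(pair, round(fc[pair], 2))
```

A model-based analysis would instead begin from a specified set of directed influences among these regions and ask whether the observed covariances are consistent with it.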

In our own laboratory, we have developed and used functional connectivity techniques to understand how brain interactions change under different task conditions and over time (Sun et al. 2004, 2005). For example, we have shown that functional connectivity changes as subjects learn a complex finger-tapping task (Sun et al. 2006). In the early phases of learning, the data show not only that subjects activate wide areas of primary sensorimotor cortex, premotor cortex, and the supplementary motor area but also that the coherence between these areas is increased relative to later stages. Such changes were not observed when subjects performed an already-learned motor skill and, more importantly, they were not found in the univariate responses, whose means were unchanged despite the changes in the subjects’ facility at the task. Similarly, in a working memory task for faces (Gazzaley et al. 2004), we found an interesting dissociation between the univariate and multivariate analyses in the networks that support so-called delay period activity. In the task, subjects encoded a cue face, maintained the image across a delay of several seconds, and then decided whether a subsequently presented probe face matched the initial one. Interestingly, we found that despite a general decrease in the univariate activity from the cue to the delay period, there was a robust increase in the correlation between activity in the right fusiform face area and a diffuse set of brain regions including the frontal and parietal cortices.

In such known networks, effective connectivity techniques can be employed to more specifically evaluate the influence of the nodes of the network on each other. McIntosh and colleagues, for example, were able to exploit their own functional neuroimaging research on working memory networks to formulate a hypothesis about the interactions of the PFC, cingulate cortex, and other brain regions during task performance (McIntosh et al. 1996). Using structural equation modeling, the authors found shifting prefrontal and limbic interactions in a working memory task for faces as the retention delay increased (Fig. 18.4). The different interactions between brain regions at short and long delays were interpreted as a functional change. For example, strong corticolimbic interactions were found at short delays, but at longer delays, when the image of the face was more difficult to maintain, strong fronto-cingulate–occipital interactions were found. The investigators postulated that the former finding was due to maintaining an iconic facial representation and the latter due to an expanded encoding strategy, resulting in more resilient memory. As in our own previous studies, information that was not seen in the univariate analysis was captured by an approach sensitive to regional interactions. In addition to structural equation modeling, other approaches have been applied to fMRI datasets to capture information regarding the relative timing of activation across brain regions such as Granger causality, information analysis, and coherence (Fig. 18.5 and Sun et al. 2004, 2005; Fuhrmann Alpert et al. 2007).

Fig. 18.4

Network analysis of fMRI data using structural equation modeling during performance of a working memory task across three different delay periods (McIntosh et al. 1996). Areas of correlated increases in activation (solid lines) and areas of correlated decreases in activation (dotted lines) are shown. Note the different pattern of interactions among brain regions at short and long delays

Fig. 18.5

Network analysis of fMRI data using coherence during the performance of a motor-learning task. Activity of some brain regions precedes activity in the region of interest whereas activity in other areas follows in time after activation of the region of interest

Cognitive Theory

An important role of studies using functional neuroimaging is to test theories of the underlying mechanisms of cognition. For example, one fMRI study (Rees et al. 1997) attempted to answer the question: “To what extent does perception depend on attention?” One hypothesis is that unattended stimuli in the environment receive very little processing (Treisman 1969), but another hypothesis is that the processing load in a relevant task determines the extent to which irrelevant stimuli are processed (Lavie and Tsal 1994). These alternative hypotheses were tested by asking normal individuals to perform linguistic tasks of low or high load while ignoring irrelevant visual motion in the periphery of a display. Visual motion was used as the distracting stimulus because it activates a distinct region of the brain (cortical area MT or V5, another functional module in the visual system). Activation of area MT would indicate that irrelevant visual motion was processed. Although the task and the irrelevant stimuli were unrelated, fMRI of motion-related activity in MT showed a reduction in motion processing during the high-load condition of the linguistic task. These findings supported the hypothesis that the perception of irrelevant environmental information depends on the processing load of the task that is currently relevant and being attended to. Thus, by showing that perception depends on attention, this fMRI experiment provides insight into the underlying cognitive mechanism.

Extending the Inferences Drawn from fMRI Studies

A typical cognitive neuroscience fMRI experiment will contrast conditions between two groups of subjects or within a group of subjects. This approach has yielded several innovative and promising discoveries about the relationship between neural activity and behavior; however, there are constraints on the inferences that can be made with standard fMRI experiments. Many of these limitations have been discussed in previous sections. This section describes methods that are increasingly being used by cognitive neuroscientists to supplement inferences supported by prototypical fMRI experiments. These approaches have helped researchers reach novel conclusions about the relationship between neural activity and cognition.

Individual Differences Approaches

An increasing number of fMRI studies are using individual differences to draw inferences about brain–behavior relationships. Studies relying on individual differences fall into two broad categories. The first category of studies measures fMRI activity during performance of a cognitive task and compares the degree of activation with concurrent behavioral measures of task performance, such as accuracy and reaction time. The second category of studies compares neural activity during task performance with separate behavioral or personality measures collected outside of the scanner, such as working memory span or IQ. The insight gained from observing the activity that covaries with behavioral or personality variables can significantly strengthen conclusions reached with standard fMRI analyses. For example, one study used fMRI to demonstrate that the activity in posterior parietal cortex correlated with the amount of information held in visual short-term memory (Todd and Marois 2004). This work suggested that the parietal cortex contributes to the retention of short-term information. A subsequent paper employing an individual differences approach demonstrated that the activity in posterior parietal cortex was correlated with individual differences in short-term memory capacity (Todd and Marois 2005). Together, the convergent results from the standard fMRI approach and the individual differences approach provide stronger support for the notion that regions of parietal cortex support short-term memory than either study provided on its own. Of course, studies of individual differences are not without their own set of caveats and methodological challenges. In particular, the large sample sizes required for meaningful interpretation of results may be prohibitive due to the cost of running a large number of subjects. 
For an excellent in-depth discussion of conceptual and methodological issues pertaining to individual differences in cognitive neuroscience experiments, refer to Yarkoni and Braver (2010).
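The core computation behind an individual differences analysis can be sketched in a few lines. The example below is a minimal illustration, not the analysis used in the studies cited above: it assumes one summary activation value per subject from a region of interest and one behavioral score per subject, and computes their Pearson correlation across subjects. The variable names (`roi_activation`, `memory_capacity`) and all numerical values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences with at least 3 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one value per subject (made up for illustration).
roi_activation = [0.21, 0.35, 0.10, 0.48, 0.30, 0.55]  # e.g., % signal change in an ROI
memory_capacity = [2.1, 3.0, 1.6, 3.8, 2.7, 4.1]       # e.g., estimated items retained

r = pearson_r(roi_activation, memory_capacity)
print(f"r = {r:.2f} across {len(roi_activation)} subjects")
```

With only six subjects, a correlation like this would be far too unstable to interpret, which is one concrete form of the sample size concern raised above.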

While the study described above demonstrates how studies of individual differences can strengthen inferences regarding functional specialization, similar approaches have also informed hypotheses regarding functional integration and cognitive theories. One study exploring functional integration examined connectivity between regions of posterior cingulate cortex and medial prefrontal cortex during working memory (Hampson et al. 2006). Despite evidence that these regions are typically deactivated during cognitive tasks (Schulman et al. 1997), connectivity between these two regions was positively correlated with working memory performance. These results identified network activity that supports working memory, while providing novel evidence regarding posterior cingulate and medial prefrontal function that was not evident using standard fMRI approaches. A recent study in our laboratory (Cohen et al. 2010) explored existing theories of working memory that suggest that the maintenance of information over a delay involves neural dynamics similar to those engaged during the initial encoding of the information (Awh et al. 2006; D’Esposito 2007; Postle 2006). Subjects performed a delayed-recognition task requiring them to remember faces and scenes over a delay. fMRI activity during maintenance was compared to activity during stimulus encoding. Subjects showing a high correspondence between activity during maintenance and activity during encoding performed better on the working memory task, while subjects with a low correspondence in activity between the two task periods performed poorly. This pattern supports the notion that the integrity of working memory depends on the degree to which neural dynamics during maintenance recapitulate dynamics during stimulus encoding. Future innovative uses of this approach will no doubt continue to strengthen the inferences about brain–behavior relationships that can be made using fMRI.
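The logic of relating encoding–maintenance correspondence to performance can be illustrated with a two-stage sketch. This is an assumption-laden illustration and not the method of Cohen et al. (2010): it assumes each subject contributes one activity value per region for each task period, defines correspondence as the Pearson correlation between the two activity profiles, and then correlates that correspondence with task accuracy across subjects. All identifiers and numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical data: activity in the same five regions during each task period,
# plus each subject's accuracy on the working memory task.
subjects = {
    "s01": {"encoding": [1, 2, 3, 4, 5], "maintenance": [1.1, 2.0, 3.2, 3.9, 5.1], "accuracy": 0.95},
    "s02": {"encoding": [1, 2, 3, 4, 5], "maintenance": [2, 1, 4, 3, 5], "accuracy": 0.88},
    "s03": {"encoding": [1, 2, 3, 4, 5], "maintenance": [3, 1, 2, 5, 4], "accuracy": 0.80},
    "s04": {"encoding": [1, 2, 3, 4, 5], "maintenance": [4, 3, 1, 2, 5], "accuracy": 0.65},
}

# Stage 1: within each subject, how closely does maintenance resemble encoding?
correspondence = {
    sid: pearson_r(d["encoding"], d["maintenance"]) for sid, d in subjects.items()
}

# Stage 2: across subjects, does correspondence track performance?
ids = sorted(subjects)
across_subjects_r = pearson_r(
    [correspondence[sid] for sid in ids],
    [subjects[sid]["accuracy"] for sid in ids],
)
print(f"correspondence vs. accuracy: r = {across_subjects_r:.2f}")
```

The second stage is an ordinary individual differences correlation; only the predictor has changed, from a regional activation level to a within-subject similarity measure.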

Integration of Multiple Methods

The most powerful inferences about brain–behavior relationships are supported by analyzing converging data from multiple methods. There are several ways in which different methods can provide complementary data. For example, one method can provide superior spatial resolution (e.g., fMRI) whereas the other can provide superior temporal resolution (e.g., ERP). In addition, different methods support different kinds of conclusions, such as whether a particular brain region is necessary to implement a cognitive process (i.e., lesion methods) or whether it is only involved during its implementation (i.e., physiological methods). The following sections describe examples of such approaches.

Combined fMRI/Lesion Studies

The combined use of functional neuroimaging and lesion studies can be illustrated with studies of the neural basis of semantic memory, the cognitive system that represents our knowledge of the world. Early studies of patients with focal lesions supported the notion that the temporal lobes mediate the retrieval of semantic knowledge (McCarthy and Warrington 1994). For example, patients with temporal lobe lesions may show a disproportionate impairment in the knowledge of living things (e.g., animals) compared with nonliving things. Other patients have a disproportionate deficit in knowledge of nonliving things (Warrington 1984). These observations led to the notion that the semantic memory system is subdivided by sensorimotor modality: living things are represented by their visual and other sensory attributes (e.g., a banana is yellow), whereas nonliving things are represented by their function (e.g., a hammer is a tool but comes in many different visual forms). The small number of patients with these deficits, along with the infrequency of anatomically circumscribed lesions, limits our ability to arrive at precise anatomical–behavioral relationships. However, fMRI studies in normal subjects can provide spatial resolution that the lesion method lacks (Thompson-Schill 2003).

These original observations regarding the neural basis of semantic memory conflicted with functional neuroimaging studies consistently showing activation of the left inferior frontal gyrus (IFG) during the retrieval of semantic knowledge. For example, an early cognitive activation PET study revealed IFG activation during a verb generation task compared with a simple word repetition task (Petersen et al. 1988). A subsequent fMRI study (Thompson-Schill et al. 1997) offered a fundamentally different interpretation of the apparent conflict between lesion and functional neuroimaging studies of semantic knowledge: left IFG activity is associated with the need to select some relevant feature of semantic knowledge from competing alternatives, not retrieval of semantic knowledge per se. This interpretation was supported by an fMRI experiment in normal individuals in which selection, but not retrieval, demands were varied across three semantic tasks. In the verb generation task, for example, the high-selection condition used nouns with many appropriate associated responses and no clearly dominant response (e.g., “wheel”), whereas the low-selection condition used nouns with few associated responses or with a clear dominant response (e.g., “scissors”). In this way, all tasks required semantic retrieval and differed only in the amount of selection required. The fMRI signal within the left IFG increased as the selection demands increased (Fig. 18.6). When the degree of semantic processing varied independently of selection demands, there was no difference in left IFG activity, suggesting that selection, not retrieval, of semantic knowledge drives activity in the left IFG.

Fig. 18.6

Regions of overlap of fMRI activity in healthy human subjects (left side of figure) during the performance of three semantic memory tasks, with the convergence of activity within the left inferior frontal gyrus (white region) (Thompson-Schill et al. 1997). Regions of overlap of lesion location in patients with selection-related deficits on a verb generation task (right side of figure), with maximal overlap within the left inferior frontal gyrus (Thompson-Schill et al. 1998).

To determine if left IFG activity was correlated with but not necessary for selecting information from semantic memory, the same task used during the fMRI study was used to examine the ability of patients with focal frontal lesions to generate verbs (Thompson-Schill et al. 1998). Supporting the earlier claim regarding left IFG function derived from an fMRI study (Thompson-Schill et al. 1997), the overlap of the lesions in patients with deficits on this task corresponded to the site of maximum fMRI activation in healthy young subjects during the verb generation task (Fig. 18.6). In this example, the approach of using converging evidence from lesion and fMRI studies differs in a subtle but important way from the study described earlier that isolated the face processing module. Patients with left IFG lesions do not present with an identifiable neurobehavioral syndrome reflecting the nature of the processing in this region. Guided by the fMRI results from healthy young subjects, the investigators studied patients with left IFG lesions to test a hypothesis regarding the necessity of this region in a specific cognitive process. Coupled with the well-established finding that lesions of the left temporal lobe impair semantic knowledge, these studies further our understanding of the neural network mediating semantic memory.

Combined fMRI/Transcranial Magnetic Stimulation Studies

Transcranial magnetic stimulation (TMS) is a noninvasive method that can induce a reversible “virtual” lesion of the cerebral cortex in a normal human subject (Bolognini and Ro 2010; Pascual-Leone et al. 1999). Using both fMRI and TMS provides another means of combining brain activation data with data derived from the lesion method. There are several advantages to using TMS as a lesion method. First, brain injury likely results in brain reorganization, yet studies of patients with lesions assume that the nonlesioned brain areas have not been affected; TMS, by contrast, is performed on the normal brain. Second, TMS has excellent spatial resolution and can target specific locations in the brain, whereas lesions in patients with brain injury are markedly variable in location and size across individuals. Such an approach can be illustrated in a recent investigation of the role of the medial frontal cortex in task-switching (Rushworth et al. 2002). In this study, subjects first performed an fMRI study that identified the regions that were active when they stayed on the current task versus when they switched to a new task. It was found that the medial frontal cortex is activated when switching between tasks. In order to determine if the medial frontal cortex was necessary for the processes involved in task-switching, the same paradigm was utilized during inactivation of the medial frontal cortex with TMS. Guided by the locations of activation observed in the fMRI study, and using an MRI-guided frameless stereotaxic procedure, it was found that applying a TMS pulse over the medial frontal cortex disrupted performance only during trials where the subject was required to switch between tasks. TMS over adjacent brain regions did not show this effect. Also, the excellent temporal resolution of TMS allowed the investigators to stimulate during precise periods of the task, determining that the observed effect occurred when the subjects were presented with a cue indicating they must switch tasks, prior to the actual performance of the new task. Thus, combining the results from both fMRI and TMS, it was concluded that medial frontal cortex was essential for allowing individuals to intentionally switch to a new task.

Recently, some groups have begun to perform TMS studies not only as an adjunct to, but also concurrently with, fMRI. The advantage of this approach is clear: applying TMS at various times during (rather than after) fMRI scans permits it to be causally linked with functional changes in the brain, even independently of behavior. In a recent study employing this technique, Ruff and colleagues (Ruff et al. 2006, 2007) examined the influence on early visual cortex of a parietal region (the anterior intraparietal sulcus, or aIPS) implicated in the generation of both covert spatial attention and eye movements. They chose a range of TMS stimulus intensities, all of which were thought to be in an effectively stimulatory rather than inhibitory range, and applied them to the aIPS while subjects fixated at the center of a viewing screen. On some trials, a randomly moving visual stimulus was present; subjects had no other task than to maintain fixation. Using this approach, the authors were able to demonstrate a parametric, so-called top-down effect from aIPS following TMS (an increase in the BOLD response in early visual cortex with increasing TMS intensity) that could be found only when visual stimuli were absent and that did not vary with retinotopic eccentricity. In contrast, their previous work (extended here) had shown that TMS of the frontal eye field (FEF) led to a decrease in BOLD response in the central visual field but to an increase in BOLD response in the peripheral visual field, irrespective of the presence or absence of a visual stimulus. The authors were consequently able to conclude that the aIPS and FEF have distinct top-down effects on visual cortex, a finding that would not have been possible without concurrent TMS.

Combined fMRI/Event-Related Potential Studies

The strength of combining these two methods is coupling the superb spatial resolution of fMRI with the superb temporal resolution of ERP recording. An example of such a study was reported by Dehaene and colleagues, who asked the question: “Does the human capacity for mathematical intuition depend on linguistic competence or on visuospatial representations?” (Dehaene et al. 1999). In this study, subjects performed two addition tasks: one in which they were instructed to select the correct sum from two numerically close numbers (exact condition) and one in which they were instructed to estimate the result and select the closest number (approximate condition). During fMRI scanning, greater bilateral parietal lobe activation was observed in the approximate condition as compared to the exact condition. Since this activation was outside the perisylvian language zone, it was taken as support that visuospatial processes were engaged during the cognitive operations involved in approximate calculation. Left-lateralized frontal lobe activation was observed to be greater in the exact condition as compared to the approximate condition, which was taken as evidence for language-dependent coding of exact addition facts. In order to consider an alternative explanation of the fMRI findings, the investigators also performed an ERP study. The alternative explanation was that in both the exact and approximate tasks, subjects would compute the exact result using the same representation for numbers, and that later processing, when they had to decide on the correct choice, was what led to the differences in brain activation. Since fMRI does not offer adequate temporal resolution to resolve these two behavioral events, which occur on a brief timescale, ERP was the appropriate method to test this hypothesis. In the ERP study, it was demonstrated that the evoked neural response during exact and approximate trials already differed significantly during the first 400 ms of a trial, before subjects had to make a decision.

Combined fMRI/Pharmacological Studies

Combining pharmacological challenges with cognitive tasks performed during fMRI scanning may yield significantly different information than either method alone (see Chap. 21). In isolation, fMRI cognitive task paradigms provide little information with respect to the underlying pharmacologic systems involved in cognition. On the other hand, drug administration without a brain measure cannot determine underlying neural mechanisms of the effects of neuromodulatory systems on cognition. Combining the two approaches allows the potential of probing the pharmacologic bases of behavior. One may measure the interactive effects of drug (compared to placebo or a range of doses) with cognitive task-related modulation of brain activity. It is fair to infer that drug × task interactions reflect modulation of the underlying anatomical and chemical brain systems and do not simply reflect nonspecific vascular effects. For example, dopaminergic agonists have been shown to have task-specific effects (Gibbs and D’Esposito 2005a, 2005b, 2006): different component processes of working memory are differentially affected by a dopaminergic drug, with effects that may differ between individuals depending on their baseline state. We have shown that a dopamine agonist improved the flexible updating (switching) of relevant information in working memory (Cools et al. 2007). However, the effect occurred only in individuals with low working memory capacity and not in individuals with higher working memory capacity. This behavioral effect was accompanied by dissociable effects of the dopaminergic agonist on frontostriatal activity. The dopamine agonist modulated the striatum during switching but not during distraction from relevant information in working memory, while the lateral frontal cortex was modulated by the drug during distraction but not during switching.
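The reasoning behind a drug × task interaction can be made concrete with a small sketch. The example assumes a 2 × 2 within-subject design (drug vs. placebo, switching vs. distraction) and one regional activity estimate per subject per cell; it computes each subject's interaction score as a difference of differences and summarizes it with a one-sample t statistic. The field names and all values are hypothetical and are not taken from the studies cited above.

```python
import math
from statistics import mean, stdev


def interaction_scores(data):
    """Per-subject drug x task interaction as a difference of differences:
    (drug - placebo) during switching minus (drug - placebo) during distraction."""
    return [
        (s["drug_switch"] - s["placebo_switch"])
        - (s["drug_distract"] - s["placebo_distract"])
        for s in data
    ]


def one_sample_t(values):
    """t statistic testing whether the mean of `values` differs from zero."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n))


# Hypothetical regional activity estimates (arbitrary units), one dict per subject.
striatum = [
    {"placebo_switch": 0.50, "drug_switch": 0.80, "placebo_distract": 0.40, "drug_distract": 0.42},
    {"placebo_switch": 0.45, "drug_switch": 0.65, "placebo_distract": 0.38, "drug_distract": 0.40},
    {"placebo_switch": 0.55, "drug_switch": 0.90, "placebo_distract": 0.42, "drug_distract": 0.41},
    {"placebo_switch": 0.48, "drug_switch": 0.60, "placebo_distract": 0.36, "drug_distract": 0.39},
    {"placebo_switch": 0.52, "drug_switch": 0.78, "placebo_distract": 0.44, "drug_distract": 0.43},
]

scores = interaction_scores(striatum)
t_stat = one_sample_t(scores)
print(f"mean interaction = {mean(scores):.3f}, t({len(scores) - 1}) = {t_stat:.2f}")
```

A drug that simply raised or lowered the vascular response everywhere would shift both task conditions by a similar amount, so the difference of differences would stay near zero; a reliably nonzero score is what licenses the task-specific interpretation.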

Conclusions

Functional MRI is an extremely valuable tool for studying brain–behavior relationships, as it is widely available, noninvasive, and has superb temporal and spatial resolution. New approaches in fMRI experimental design and data analysis are appearing in the literature at an almost exponential rate, leading to numerous options for testing hypotheses on brain–behavior relationships. Combined with information from other complementary methods, such as the study of patients with focal lesions, healthy individuals with transcranial magnetic stimulation, pharmacological intervention, or event-related potentials, data from fMRI studies provide new insights regarding the organization of the cerebral cortex, as well as the neural mechanisms underlying cognition.