1 Introduction

While recent brain imaging work has expanded our understanding of the mechanisms of perceptual, cognitive, and motor functions in human subjects, research into the cerebral control of emotional and motivational functions has been less intense. For several years, however, a growing body of fMRI and positron emission tomography (PET) work has been assessing the cerebral correlates of emotional activation, perception, learning and memory, and emotional regulation behavior in healthy humans [1–6].

Current brain imaging work is based on the concepts and hypotheses of the multidisciplinary field of “affective neuroscience” [7–9]. The endeavors of the subdisciplines of affective neuroscience have not only complemented but also promoted each other, stimulating a rapid growth of knowledge in the functional neuroanatomy of emotions. It is increasingly recognized that these areas also share similar methodological problems.

The expanding area of emotion neuroimaging has provided new methods for validating neurocognitive models of emotion processing that are crucial for many areas of research and clinical application. Progress is being made in disentangling the cerebral correlates of interindividual differences and personality, as well as of abnormal conditions such as anxiety, depression, psychoses, and personality disorders [10–12]. It has been recognized that psychological assessment, categorization procedures, and psychotherapy treatment may profit from models that integrate functional connectivity information. The relevance and usefulness of valid neurocognitive models of emotion processing have recently been recognized by many fields of applied research, such as psychotherapy research [13] and criminology [14], as well as areas such as “neuroeconomics” [15] and “neuromarketing” [16].

Several human lesion studies have pointed to the deficits of neurological patients in recognizing emotions in faces, most often in decoding fearful faces and especially after bilateral amygdala damage [17–20]. Other studies have reported impairments not only for fear but also for other negative emotions such as anger, disgust, and sadness [21, 22]. Recent functional imaging studies have confirmed the importance of the amygdala in emotion processing. Due to the multiple connections between the amygdala and various cortical and subcortical areas, and the fact that the amygdala receives processed input from all the sensory systems, its participation is essential during the initial phase of stimulus evaluation [23]. The appraisal function of the amygdala, combining external cues with an internal reaction, reflects the starting point for a differential emotional response and is hence the basis for emotional learning. Involvement of the amygdala during classical conditioning, especially during the initial stages of learning [24, 25], as well as during the processing of signals of strong emotions, has been documented repeatedly with fMRI. However, a problem in verifying amygdala activation with neuroimaging tools may be the rapid habituation of its responses [10, 26].

Although the need for brain imaging data is not unequivocally acknowledged by all researchers in their specialties, the increasing body of neuroimaging data has value in challenging and constraining existing theories. Followers of cognitive emotion theory must face the fact that their results need to be compatible with, or at least not contradict, established neuroscience (neuroimaging) findings [27]. However, to appropriately evaluate and integrate this knowledge, it is necessary to deal with the basic methodological problems of the field.

Therefore, this chapter is organized around the two major issues of the neuroimaging of emotional function. First, it addresses underlying conceptual issues and difficulties associated with operationalizing and measuring emotion (for more detailed reviews, see Refs. [28–31]). The problems and limitations of brain imaging work that are associated with measurement precision, response scaling, reproducibility, as well as validity and generalizability are discussed in line with general principles of behavioral research [32].

Second, the complexities of neuroimaging methods are examined to supplement recent quantitative meta-analyses (for a summary of findings of the emotional neuroimaging literature, see Refs. [2, 4, 5]). Here we raise some grounds for reflection about current measurement in the neuroimaging of emotions and encourage the adoption of recent methodological advances in fMRI technology. In summary, it is suggested that additional interdisciplinary efforts are needed to advance measurement quality and validity, and to accomplish an integration of brain imaging technology and neuropsychological assessment theory.

2 Psychological Methods

2.1 Emotion Theories and Constructs

2.1.1 Definitions

Emotions have been defined as episodes of temporarily coupled, coordinated changes in component functions as a response of the organism to external or internal events of major significance. These component functions entail subjective feelings, physiological activation processes, cognitive processes, motivational changes, motor expression, and action tendencies [33, 34]. Emotions represent functions of fast and flexible systems that provide basic response tendencies for adaptive action [35].

Emotions can be differentiated from mood changes (extended change in subjective feeling with low intensity), interpersonal stances (affective positions during interpersonal exchange), attitudes (enduring, affectively colored beliefs, preferences, and predispositions toward objects or persons), and personality traits (stable dispositions and behavior tendencies) [29, 34].

The frequently used concept of “emotional activation” characterizes a relatively broad class of physiological or mental phenomena (e.g., strain, stress, physiological activation, arousal, etc.). It can be specified with respect to a variety of dimensions such as valence (quality of emotional experience), intensity or arousal (global organismic change), directedness (motivational and orientating functions), and selectivity (specific patterns of change) [36]. In contrast, the terms emotional reactivity or arousability, and psychophysical reactivity refer to the dispositional variability of the above activation processes under defined test conditions [37, 38].

Environmental objects possess a latent meaning structure of emotional information, which is represented by a hierarchy of constructs with relatively fixed intra- and interclass relations [29, 39, 40]. Accordingly, physical stimulus properties or surface cues serve as a basis for “universal” emotion categories such as happiness, surprise, fear, anger, sadness, and disgust that originate at a primary level [41]. On a secondary level, dimensions such as valence and arousal arise from the preceding levels [42]. Table 1 suggests a potential structure of emotional concepts or domains that integrates both discrete (primary) and secondary emotions [29].

Table 1 Hierarchical organization of emotion concepts (modified from [29])

Emotional activation has also been characterized as a process with a sequence of stages [29, 35]: following an initial evaluation of novelty, familiarity, and self-relevance, a stimulus object or context is fully encoded. This involves detection of physical stimulus features, recognition of object identity, and identification of higher-order emotional dimensions such as pleasantness or need significance. During the subsequent stages, cognitive appraisal processes are initiated to evaluate the significance of the event. These evaluation checks include an appraisal of whether the stimulus is relevant for personal needs or achieving certain goals. Finally, the potential to overcome or cope with the event and the compatibility of behavior with the self or social norms is evaluated [35].

2.1.2 Operationalization

The measurement of emotions crucially depends on an appropriate operationalization of the construct of interest and definition of response parameters. Such considerations have typically been elaborated in the context of psychological assessment theory [30, 38, 43]. The latter explains how psychological and physiological measures can be empirically assessed, decomposed, and used as indicators of the psychological constructs of interest. It organizes the assumptions concerning measurement, segmentation, and aggregation of activation measures, and evaluates the distribution characteristics and reliability of the data. It also determines the range of the construct of interest by localizing it according to variables, subjects or settings/situations, or combinations of these sources of variation. Since most current operationalizations are confined to one of these aspects, the range of conclusions to be drawn from the findings is also limited.

A particular problem associated with measuring emotional reactions is a certain lack of covariation of response measures. A frequent finding is that the expected synchronization of verbal, motor, and physiological response systems during an emotional episode is the exception rather than the rule. Although emotional episodes supposedly give rise to a synchronization of central, autonomic, motor, and behavioral variables [44], most emotional response measures only show imperfect coupling [45]. This response incoherence may be attributed to a temporary decoupling or dissociation of function [46]. This has led authors to propose a triple response measurement strategy, that is, a multimodal assessment of emotion including responses in the verbal, gross motor, and physiological (autonomic, cortical, neuromuscular) response systems [47].

Research on human emotion has illustrated how the broad concept of emotion is subdivided into several component functions that dynamically interact during an emotional episode. Diverse operationalizations have been suggested to assess these subconstructs, many of which are highly correlated and form clusters or families of similar functions. Emotional activation processes are embedded in a multicomponential system of situational and personal determinants. Factors that shape the level and pattern of the emotional activation process are the following [29, 48]: the functional context of the task (e.g., cognitive processing, motor responses, autonomic functions, etc.); the direction and extension of effects (e.g., global versus selective activation); the intensity and the degree of emotional strain (e.g., low, middle, or traumatic intensity; degree of threat; intensity of physical/mental load; stimulus intensities below or above threshold); the time characteristics (e.g., duration, structure, and variability of a stimulus; effects of stimulus repetition or pre-exposure); the informational content (e.g., the degree of information and dimensions inherent in the experimental stimuli such as emotional valence or arousal, preparedness, novelty, safety, predictability, contingency information, etc.); the implications for action (conduciveness, implications for instrumental reactions; artificial vs. realistic nature of the procedure); the coping potential (e.g., active coping vs. passive enduring, degree of controllability, helplessness, social support, specific coping strategies); and compatibility with self or social norms (e.g., personal relevance).

These different aspects have led to a large number of operationalizations. These include procedures to elicit orienting or startle reactions, basic emotions or “stress,” as well as stimulus-response paradigms and conditioning procedures. For example, one such standard procedure is to elicit orienting reactions (OR) by emotionally meaningful stimuli. The OR is a nonassociative process being modulated by excitatory (sensitization) and inhibitory (habituation) mechanisms. Pavlovian (classical) or instrumental conditioning of excitatory or inhibitory reactions has traditionally been investigated in autonomic reactions (cardiovascular, vasomotor, and electrodermal conditioning), motor responses (eye blink), and endocrine or immune system reactions [1].

Emotional experience is strongly influenced by cognitive activities which modulate attention and alertness (avoidance and escape), vigilance processes (information search and problem solving), person–situation interactions (denial, distancing, cognitive restructuring, positive reappraisal, etc.), and actions, which change the person–environment relationship [49]. Coping research has identified typical cognitive strategies to regulate arousal during an emotional episode such as rejection (venting, disengagement) and accommodation strategies (relaxation, cognitive work) [50]. Cognitive activities subsume engagement (reconceptualization, reevaluation strategies such as rationalization or reappraisal) and distraction techniques.

These behavioral and cognitive regulation processes have been studied for many decades [51]. This research has shown that the outcome of coping processes crucially depends upon the valence, ambiguity, controllability, and changeability of a stressor. Input-related regulation (denial, distraction, defense, or cognitive restructuring; [52]) or antecedent-focused regulation (selection, modification, or cognitive restructuring of situational antecedents; [53]) have been differentiated from response-focused processes (suppression of expressive behavior and physiological arousal; [53]).

While the behavioral procedures mentioned above are mostly unstandardized, a vast number of standardized psychometric instruments are available to assess the higher-order emotional processes (for a review see Ref. [31]). Questionnaires are the most frequently used method, followed by behavior ratings by experts or significant others. However, these data assess subjective representations, that is, personal constructs, and may be obscured by biased responding.

2.2 Research Design and Validity

2.2.1 Research Design

The requirements for experimental research [32] are not fulfilled by many early research designs in emotional neuroimaging. This is typical for the pilot stage of scientific progress. In many cases, only preliminary or correlational interpretations are possible due to incomplete or missing control conditions (e.g., with respect to the “awareness” of emotional stimuli; [54]). In contrast, more recent work increasingly makes use of full factorial designs or applies parametric variations of the independent variable [55]. Moreover, new techniques of covariance analysis are available to explore the causal predictive value of structural data on emotional brain activation. The relationship of structural and functional connectivity data has been explored by means of Structural Equation Modeling [56–58] and Dynamic Causal Modeling [59]. Functional brain imaging has also been successfully combined with the lesion approach to elucidate the modulating influences of interconnected brain regions [60]. Thus, by means of appropriate research plans and advanced techniques of analysis, an “effective connectivity” can be identified that elucidates the causal relations of one neural system to another [61]. For example, the functional connectivity of the prefrontal cortex (PFC) that is supposed to modulate amygdala activity [62] might thus be better evaluated in terms of causality.

To avoid operationalization errors, the quality of the emotion induction procedure needs to be scrutinized, that is, it must be evaluated whether the intended emotion has actually been elicited. This check is necessary because a variety of emotional and nonemotional stimulus situations may trigger amygdala activations [5], so that amygdala activation alone does not show that a specific emotion (such as fear) was induced. Since subjective report is not always an appropriate manipulation check, additional psychophysiological criteria are needed to validate the intended emotion. Sympathetic activity as indexed by electrodermal activity (EDA) has been assessed during imaging procedures for this purpose. Nevertheless, this does not validate fear since skin conductance responses represent the endpoint of many different processes [63].

2.2.2 Construct Validation

Brain imaging work implements specific neuropsychological construct validation strategies by associating behavioral measurement of emotion with functional brain activation data for different localizations [31]. Here, functional (physiological) data are related to but still remain categorically distinct from the psychological data that emerge from a particular behavioral paradigm. During the process of construct validation, indicators of connectional or neurophysiological constructs are related to the indicators of psychological constructs. Thus, different operationalizations of a certain psychological construct (procedures or task) are expected to be correlated with activations of a certain area or cluster of areas. A different construct is expected to correlate with another but not the previous area and vice versa. This corresponds to the double dissociation approach, which inspects task by localization interactions. This process of neuropsychological concept formation typically starts at a relatively broad level and proceeds downward in the above hierarchy finally specifying within-systems localization constructs [64].
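The task-by-localization logic of the double dissociation approach can be sketched numerically. The following minimal Python example assumes invented mean activation values for two hypothetical tasks and two hypothetical regions; the labels `task_A`, `region_1`, and both function names are illustrative and are not taken from any cited study. It only computes the descriptive interaction contrast and checks for a crossover pattern; a real analysis would test this interaction statistically across subjects.

```python
# Hypothetical mean activation estimates (arbitrary units).
# All values are invented for illustration only.
activation = {
    ("task_A", "region_1"): 1.2,
    ("task_A", "region_2"): 0.1,
    ("task_B", "region_1"): 0.2,
    ("task_B", "region_2"): 1.0,
}


def interaction_contrast(act):
    """Task-by-localization interaction: (A1 - B1) - (A2 - B2)."""
    diff_region_1 = act[("task_A", "region_1")] - act[("task_B", "region_1")]
    diff_region_2 = act[("task_A", "region_2")] - act[("task_B", "region_2")]
    return diff_region_1 - diff_region_2


def is_double_dissociation(act, margin=0.0):
    """Crossover pattern: task A dominates in region 1, task B in region 2."""
    a_over_b_in_r1 = act[("task_A", "region_1")] - act[("task_B", "region_1")]
    b_over_a_in_r2 = act[("task_B", "region_2")] - act[("task_A", "region_2")]
    return a_over_b_in_r1 > margin and b_over_a_in_r2 > margin


print(interaction_contrast(activation))
print(is_double_dissociation(activation))
```

A large interaction contrast without the crossover (for example, task A exceeding task B in both regions, only more so in one) would indicate a single dissociation at best, which is weaker evidence for separable constructs.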

However, depending on limitations of the measurement device described below, the reliability of psychological or activation data declines at lower levels of structural constructs complicating this validation process. The diverse validation attempts typically draw upon convergent or divergent associations of constructs that are located at quite different levels of generality. However, successful construct validation very much depends upon whether brain activation and psychological measures are analyzed on the same level of generality. In cases of asymmetry, low relationships may result that provoke misinterpretations and confuse the validation process. Thus, successful construct validation in the affective neurosciences requires emotional constructs and brain activation data to be measured on the same (symmetrical) level of generality or aggregation [31].

Emotional neuroimaging is typically guided by neuropsychological construct validation strategies. Here, the constructs are operationally defined by the complementary methods of emotion psychology and of neurophysiology. Both construct types are embedded in hierarchically organized networks with lower- and higher-order levels of generality. Both types of data are associated with each other during validation. However, it is necessary to define neural and emotional constructs on the same level of generality. For example, when a relatively broad behavioral category or set of functions (“emotion regulation”) is being associated with isolated cerebral substructures, the relationship is likely to be asymmetrical, and disappointingly low correlations might result, confusing the validation process.

2.2.3 Internal Validity

fMRI is known to be a highly reactive measure because the scanner setting (gradient noise and the supine position) causes the subject to respond to the experimental situation as a stressor. Unless habituation sessions are included in the procedure, tonic stress and arousal effects may be induced that modulate responding as discussed above. For example, a decreasing rate of response of the amygdala to a conditioned stimulus during the late phase of acquisition [10, 24, 26, 65] may also be attributable to testing effects (sensitization to the setting, acquaintance with the procedure, and type of unconditioned stimulation) rather than fast amygdala habituation per se. Other factors, such as potential ceiling effects, baseline dependencies, and regression to the mean, might also explain reduced amygdala perfusion measures. In general, familiarity with emotionally activating procedures in the scanner induces states of expectation and sensitizing or desensitizing effects that may confound follow-up measurement. In addition to these testing effects, history, that is, occurrences other than the treatment and individual experiences between a first and a second measurement, is likely to endanger the assessment of emotion (e.g., when assessing psychotherapy effects).

Changes in the observational technique, the measurement device or sequence and other instrumentation effects may also obscure emotion-related treatment variance during an fMRI session or across sessions. From the discussion of MR methods it is clear that longitudinal changes of measurement precision are also to be expected from inconsistent acquisition geometry and shim, as well as system instabilities and hardware changes.

It is well known from psychophysiological research that the interpretation of repeated measurement factors is complicated by initial value dependencies [66]. When the hemodynamic response is fitted relative to the prestimulus baseline, a physiological or statistical dependency of tonic perfusion levels and the phasic reaction may prevail [67]. While the first experimental blocks may show extreme effects, subsequent measurements are likely to be closer to the mean. Moreover, it has been pointed out above that the reliability of blood oxygen level-dependent (BOLD) measurements may be compromised by distortions or signal loss. When emotional paradigms with inconsistent effects are used or when subjects with an extreme variability of emotional responsivity are investigated, experimental effects are likely to show “regression to the mean.”
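Regression to the mean can be illustrated with a small simulation. The sketch below is not based on imaging data: it assumes that each simulated subject has a stable "true" response plus independent Gaussian measurement noise in each of two blocks, and that the most extreme responders are selected on the basis of the first block. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_regression_to_mean(n_subjects=5000, noise_sd=1.0,
                                top_fraction=0.1, seed=0):
    """Return mean responses in block 1 and block 2 for the subjects
    with the most extreme block-1 responses (assumed noise model)."""
    rng = random.Random(seed)
    # Stable subject-specific response (standard normal, by assumption).
    true_resp = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    # Two noisy measurements of the same stable response.
    block1 = [t + rng.gauss(0.0, noise_sd) for t in true_resp]
    block2 = [t + rng.gauss(0.0, noise_sd) for t in true_resp]
    # Select the subjects with the largest block-1 responses.
    n_top = max(1, int(n_subjects * top_fraction))
    order = sorted(range(n_subjects), key=lambda i: block1[i], reverse=True)
    top_idx = order[:n_top]
    mean1 = statistics.fmean(block1[i] for i in top_idx)
    mean2 = statistics.fmean(block2[i] for i in top_idx)
    return mean1, mean2


mean_block1, mean_block2 = simulate_regression_to_mean()
print(mean_block1, mean_block2)
```

Under these assumptions the selected group's second-block mean lies closer to the population mean of zero than its first-block mean, although no treatment or habituation was simulated. The same logic applies when extreme first-block responses are compared with later blocks within a session.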

Subjects change as a function of time and these maturation effects may occur during the time range of the experiment (psychophysiological changes of organismic state or psychological stance, in particular during aversive paradigms). State-dependent influences or maturation effects may hamper within-subject replication or evaluations of long-term psychotherapy effects.

Subject groups with elevated emotionality are more likely to show greater dropout rates in stressful experiments, that is, subjects of one group drop out as a consequence of their specific reactivity to the emotional strain of the challenge paradigm. If exit from an emotionally activating study is not random, this effect of “experimental mortality” may confound comparison between groups.

Selection effects, that is, group differences from the outset of the study, are likely in functional imaging studies with very small numbers of participants. Selective recruitment of volunteers or drop out of participants may lead to decreased reactivity and lower emotionality in the remaining study group. Poor recruitment techniques (e.g., drafting subjects from the social circle of the lab partially acquainted with the procedures) or lack of random assignment to groups may further limit the validity of emotional fMRI studies.

Interactions of selection with maturation may occur when groups that differ with respect to maturation processes are compared (e.g., administering a social stress test for cortisol stimulation at different times of the day). Gender, personality traits, or psychopathology are all associated with specific individual differences of emotional regulation behavior. When these behaviors change over time as a function of personal development, follow-up measurements may be confounded by this type of effect. Thus, poor randomization or lack of control of personality-specific variance may jeopardize brain activation studies of emotional behavior. Finally, an interaction of selection with instrumentation occurs when experimental subjects and controls show pre-experimental differences with respect to the shape of their responses such as floor or ceiling effects.

In general, emotional responses show an intraindividual instability: measurement artifacts (see Sect. 3), state-dependent influences, and characteristics of the subject (age, gender, experience, temperament) all impose additional effects on functional neuroimaging results [5]. A considerable degree of within- and between-subject variation in the time course of emotional responding depends on habitual, subject-specific mechanisms. First, the phasic activation pattern reflects the short-term modulation in response to the emotional stimulus. Due to the temporal within-trial variability of BOLD responses in different brain regions, averaging across subjects may obscure the detection of activation in a specific region and reduce effect sizes specifically for higher-level reactions. Second, activation also varies across the time course of the experiment. Most subjects show a constant increase in autonomic arousal depending on the degree of emotional stimulation. This is not only accompanied by a systemic response (tonic increase of sympathetic activation including blood pressure, cardiac contractility, and variability), but also by variations of tonic perfusion. These changes may show divergent trends for cortical and limbic regions, imposing an unknown error on the measurement of the phasic BOLD reaction. These tonic and phasic variations appear to reflect the subject-specific mechanisms of emotional regulation behavior.

2.2.4 External Validity and Generalizability

2.2.4.1 Generalization to Other Procedures and Paradigms

The majority of current paradigms have focused on lower-level perceptual or learning processes pertaining to basic or secondary emotional categories. Since the results depend on the selected task parameters (degree of induced arousal, hedonic strength, and motivational value; degree of involvement of memory processes; reinforcement schedule; conditioning to cues or contexts; etc.), a comparison with and generalization to other operationalizations remains difficult. Systematic neuroimaging approaches to higher-level appraisal processes are still sparse. These involve evaluations of the motivational conditions and coping potential, that is, the ability to overcome obstructions or to adapt to unavoidable consequences [29]. An expanded range of constructs would involve an assessment of social communication processes, beliefs, preferences, predispositions, high-level evaluation checks, as well as modulating sociocultural influences. Higher-order appraisal processes involve the evaluation of whether stimulus events are compatible with social standards and values or with the self-concept. Another function to be explored concerns the degree to which a stimulus event may increase, decrease, or even block goal attainment or need satisfaction, and activate a reorientation of the individual’s goal/need hierarchy and behavioral planning (goal/need priority setting) [29].

Whereas frontostriatal mechanisms of motor control have been increasingly investigated, recent work has made efforts toward developing an understanding of how emotion and motivation are linked to the frontal mechanisms controlling the preparation and execution of behavior [68, 69]. Behavior preparation and execution represent closely integrated components within an emotional episode. Mobilization of energy is required to prepare for a certain class of behavior. Action planning and motor preparation require sequencing of actions and generation of movements. However, an emotion preceding behavior is only one of a number of factors, including situational pressures, strategic concerns, or instrumentality, involved in eliciting the concrete action. Additional research is needed to trace the information flow from motivational to motor systems.

Another component is the verbal or nonverbal communication of emotions such as facial expression or vocal prosody [70]. The ability to verbally conceptualize emotions and to communicate emotional experiences plays an important role in the regulation of an ongoing emotional episode. For example, explicit emotion-labeling tasks have been shown to decrease the activation level of the amygdala [71, 72].

Finally, sociocultural factors may shape attitudes (relatively enduring, affectively colored beliefs, preferences, and predispositions toward objects or persons) as well as interpersonal stances (affective stance taken toward another person in a specific interaction). The ability of the individual to form representations of beliefs, intentions, and affective states of others has a considerable importance for affective and interpersonal interaction. However, the effects of beliefs, preferences, and predispositions on lower levels of emotional responding have attracted little attention. Top-down processes may induce considerable variations of task and stimulus parameters by modulating lower-level automatic processes and by controlling the late behavior preparation stages during the emotional process. Thus, generalization to other paradigms and constructs has limitations because higher-level behavioral and cognitive strategies that are part of the individual emotion regulation system ([50]; see later) modulate the emotion process.

2.2.4.2 Generalization to Other Subjects and Populations

The study groups of many fMRI studies have been relatively small and poorly described with respect to personality dimensions. Since several studies provide evidence for trait-dependent differences in responding [73–76], it remains unclear to what extent the results may have been influenced by interindividual differences of the participating subjects. The representativeness of results is particularly poor if members of the social circle of the lab serve as participants instead of independently recruited participants. Thus, when the effects of an emotional paradigm interact with characteristics of the study groups (such as a low level of emotionality in subjects willing to participate in an activating scanning condition), this selection × treatment effect may endanger generalizations to other populations.

2.2.4.3 Generalization to Other Times and Settings

The prediction of future emotional or psychopathological disorders on the basis of emotional behavior assessed in the scanner remains difficult [77]. Eliciting emotions in the imaging scanner is a highly artificial situation. It remains unclear to what extent these results can be generalized to other settings and, in particular, to real-life settings. Small and Nusbaum [78] have criticized the unnatural MRI scanner setting and suggested an “ecological functional brain imaging approach” that includes monitoring of natural behaviors using a multimodal assessment and environmental context of presentation or behavior. Nevertheless, in contrast to the scanner, emotion in real settings is not restricted to simple reactions but includes the full range of regulatory actions. By correlating fMRI and field data, such as those generated by emotion monitoring during everyday life [79], the “ecological validity,” that is, the predictive value of cerebral perfusion patterns for real-life emotions, could be better evaluated.

3 fMRI Methods

3.1 Methodological Challenges

3.1.1 Introduction

A host of fMRI studies have identified the amygdalae as central structures in emotion processing (see Sect. 1 and Zald et al. [5], for example, for a review). The amygdalae lie in the anterior medial temporal lobe (MTL), bounded ventrolaterally by the lateral ventricles and medially by the sphenoid sinuses (Fig. 1). The differing magnetic susceptibilities of these tissues cause large deviations in the static magnetic field, B0. There is also a strong gradient in B0 in the MTL, and differing precession frequencies lead to dephasing of the bulk magnetization and loss of signal in images. This problem is not restricted to the amygdala, however. Inferior frontal and orbitofrontal regions, likewise involved in emotion processing [80], are also zones of high static magnetic field gradient. In addition to signal loss, static magnetic field gradients also lead to echo times (TE) becoming shifted, so that BOLD sensitivity may be reduced, or signal may not be acquired at all (termed “Type 2” loss [81]). These problems are examined in Sect. 3.1.2.

Fig. 1

The amygdalae, central brain structures in emotion processing, lie in a region of moderate deviation from the static magnetic field (left) and very high static magnetic field gradients (right). The planes intersect in the amygdala at MNI coordinate (18, −2, −18), marked by arrows. Single subject measurement at 4.0 T

Local variations in the static magnetic field strength confound spatial encoding of the MR signal, leading to image distortion. Particular considerations for the MTL in this regard are discussed in Sect. 3.1.3. Even at high field, deviations from B0 immediately in the amygdala are relatively moderate (Fig. 1 left; 10 Hz measured at the arrow position, for data acquired at 4.0 T) but the field gradient is high (2 Hz/mm at the same position), leading to very large distortions in neighboring structures, which can cause signal to encroach into the amygdalae.

The ventral brain is also prone to physiological artifacts of cardiac and respiratory origin, as described in Sect. 3.1.4, which may be mitigated to some extent by simultaneous measurement of cardiac and respiratory processes and the application of postprocessing corrections. In addition to the measurement challenges of ventral brain imaging, the presence of large magnetic field gradients makes the ventral brain susceptible to stimulus-correlated motion (SCM) artifacts, as discussed in Sect. 3.1.5. These can lead to the appearance of neuronal activation (Fig. 2) arising from subtle head movements which are time locked to stimuli.

Fig. 2

Large static magnetic field gradients make the amygdala region prone to the artifactual appearance of neuronal activation when stimulus-correlated motion (SCM) is present. Left: Observed patterns of SCM of schizophrenic patients and controls in a 3.0-T experiment with three stimulus blocks (facial emotion and age discrimination “EMO” and “AGE”). Right: A baseline (no stimulus) study in which a subject executed submillimeter SCM similar to that of Patient 1. The contrast corresponds to the “EMO” periods (uncorrected p < 0.0001; t threshold = 5, Montreal Neurological Institute coordinates 22, −6, −16)

A further potential confound is the presence of resting state networks (RSNs) which colocalize with regions under study. These show slow fluctuations in the absence of stimuli and constitute sources of unmodeled noise and intertrial variation. The existence of an RSN in the amygdalae (Fig. 3) offers a possible explanation of why small signal changes are generally recorded in these structures, despite the high neurovascular reactivity of deep gray matter nuclei. This and other RSNs which may involve the amygdala are described in Sect. 3.1.6.

Fig. 3

Signal changes in the amygdala in emotion experiments have to be measured against a background of resting state fluctuations. A resting state network has recently been reported, covering the amygdala and basal ganglia (3.0 T, group independent component analysis of 26 young healthy adults). Adapted from [106] with permission from the ISMRM

In Sect. 3.1 we expand on the problems outlined here, and go on in Sect. 3.2 to detail approaches to optimizing conventional single-shot 2D gradient-recalled echo-planar imaging (EPI) to mitigate their effects, alternative sequences which are less sensitive to static magnetic field gradients and, in Sect. 3.3, methods to correct for image distortion, physiological noise, and SCM artifacts.

3.1.2 Signal Loss and BOLD Sensitivity Loss

It is worthwhile to briefly review the problem of signal loss from an empirical perspective. A temporal resolution of 1–3 s is usually desirable in fMRI. The whole brain may be covered in this time by acquiring images with voxels of typically 3-mm size (or 27 μl). Relatively long TEs are employed, partly as a technical necessity (to allow time for gradient switching and echo sampling) but also to confer T2* weighting. As well as providing sensitivity to BOLD effects, however, this allows time for dephasing from macroscopic inhomogeneities to develop. The severe signal loss seen in EPI in the anterior MTL with typical parameters is illustrated in the lower left two images of Fig. 4.

Fig. 4

Effects of voxel size and acceleration factor on T2* and echo-planar imaging (EPI) image quality at high field (4.0 T). Top: T2* in coronal and axial slices through the amygdala at two voxel sizes. Bottom: corresponding EPI in slices through the amygdala with acquisition voxel sizes of 4 × 4 × 4 mm, 3 × 3 × 3 mm, 2 × 2 × 2 mm, and 2 × 2 × 2 mm with GRAPPA acceleration of factor 2, all with echo time (TE) = 32 ms

In gradient-echo imaging, the MR signal decays with a time constant T2*, comprising the transverse relaxation time, T2 (reflecting irreversible decay arising from time-varying microscopic spin-spin processes), and T2′, the reversible contribution to the transverse decay rate and the major source of BOLD contrast. T2′ itself can be separated into “mesoscopic” contributions (which operate on a scale smaller than the voxel, e.g., dephasing in the capillary bed), and “macroscopic” contributions (meaning larger than the voxel) which stem from bulk field inhomogeneities and which are dependent on the tissues present, on the quality of shim, and on the scanning parameters such as voxel size and slice orientation. Separating these effects, the MR signal S in a gradient-echo experiment decays such that at the TE it can be expressed [82] as:

$$ S\left(\mathrm{T}\mathrm{E}\right)=S(0)\times \exp \left(\frac{-\mathrm{T}\mathrm{E}}{T_2}\right)\times F\left(\mathrm{T}\mathrm{E}\right), $$
(1)

where an approximation to F(TE) for linear field variations over voxels, ΔB_i, in the x, y, and z directions is

$$ F\left(\mathrm{T}\mathrm{E}\right)=\mathrm{sinc}\left(\frac{\upgamma \Delta {B}_x\mathrm{T}\mathrm{E}}{2}\right)\times \mathrm{sinc}\left(\frac{\upgamma \Delta {B}_y\mathrm{T}\mathrm{E}}{2}\right)\times \mathrm{sinc}\left(\frac{\upgamma \Delta {B}_z\mathrm{T}\mathrm{E}}{2}\right). $$
(2)

This illustrates that the signal decay rate may be reduced by decreasing the voxel size (which reduces the gradients across voxels, ΔB_i) or by reducing the TE.
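
The dependence of F(TE) on voxel size in Eq. (2) can be made concrete with a short numerical sketch. The function name, the choice of a single gradient direction, and the TE are illustrative assumptions; the gradient of 2 Hz/mm is the value quoted above for the amygdala at 4.0 T. The field variation is expressed as a frequency spread in Hz, so that γΔB_i TE/2 becomes π Δf_i TE.

```python
import numpy as np

def dephasing_factor(grad_hz_per_mm, voxel_mm, te_s):
    """F(TE) from Eq. (2) for a linear field variation across a voxel.

    grad_hz_per_mm: field gradient along x, y, z, expressed in Hz/mm
    voxel_mm:       voxel edge lengths along x, y, z, in mm
    te_s:           echo time in seconds
    """
    # Frequency spread across the voxel in each direction (Hz).
    delta_f = np.asarray(grad_hz_per_mm, dtype=float) * np.asarray(voxel_mm, dtype=float)
    # gamma * dB * TE / 2 = pi * delta_f * TE, and np.sinc(x) = sin(pi x) / (pi x).
    return float(np.prod(np.sinc(delta_f * te_s)))

# Illustrative values only: 2 Hz/mm along one axis (assumed), TE = 32 ms.
gradient = (0.0, 0.0, 2.0)
for size in (4.0, 3.0, 2.0):
    f = dephasing_factor(gradient, (size, size, size), 0.032)
    print(f"{size:.0f} mm isotropic: F(TE) = {f:.3f}")
```

Under these assumptions the factor approaches 1 as the voxel shrinks, which is the behavior the text describes; real field variations are not strictly linear, so the numbers are indicative only.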

The aim of any attempt to optimize an EPI sequence is to maximize not just signal, as described above, but also BOLD sensitivity (BS), which is equal to the product of image intensity and TE. For magnetically homogeneous regions, BS is a maximum when the EPI effective TE is equal to the T2* of the target region [83]. In inhomogeneous regions, however, the presence of field gradients shifts the location of signal in k-space, mainly in the phase-encode direction (because of the low bandwidth), changing the local TE [81]. Through-plane field gradients lead to signal loss and reduce BS. If the component of the in-plane susceptibility gradient in the phase-encode direction is antiparallel to the phase-encode gradient “blip” direction, then the TE is also reduced, reducing BS further. Conversely, if it is parallel to the phase-encode “blips” then TE increases. While this increases BS, to some extent compensating for signal loss, if the shift of TE is too large the echo will fall outside the acquisition window, leading to complete signal dropout. This is commonly observed in the anterior MTL for a negative-going phase-encode scheme.
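
For the homogeneous case, the condition that BS peaks when TE equals T2* follows from a one-line calculation. The sketch below assumes monoexponential decay with a single T2* and no echo shift, which is a simplification of the situation described above:

$$ \mathrm{BS}\left(\mathrm{TE}\right)\propto \mathrm{TE}\times S(0)\exp \left(\frac{-\mathrm{TE}}{T_2^{*}}\right),\kern1em \frac{d\,\mathrm{BS}}{d\,\mathrm{TE}}\propto S(0)\exp \left(\frac{-\mathrm{TE}}{T_2^{*}}\right)\left(1-\frac{\mathrm{TE}}{T_2^{*}}\right)=0\kern0.5em \Rightarrow \kern0.5em \mathrm{TE}={T}_2^{*}. $$

The derivative is positive for TE below T2* and negative above it, so the stationary point is a maximum.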

This description motivates the optimization approaches to EPI in susceptibility-affected regions which will be outlined later in this section: compensating through-plane gradients, selecting image orientation and gradient direction to minimize echo shifts, and reducing voxel sizes to reduce field gradients. These techniques will be shown to increase both signal and BS.

3.1.3 Image Distortion

Accurate spatial encoding in MRI is founded upon a homogeneous static magnetic field in the object. The location of signal is deduced from the local field strength under the application of small orthogonal, linear magnetic fields in directions usually referred to as slice select, readout, and phase-encode. The method is confounded if there are regional variations in the static magnetic field, which lead to signal mislocalization (distortion). Typical field offsets are illustrated in Fig. 1 (left) and lead to EPI distortions of the image shown in Fig. 4.

The extent of distortion, expressed as the number of pixels by which signal is mislocalized, is equal to the local magnetic field deviation divided by the bandwidth per pixel (the reciprocal of the time between measuring adjacent points in k-space), expressed in the same units. The bandwidth per pixel in the readout direction (rBWread/pix) is equal to the total imaging bandwidth (the signal sampling rate) divided by the image matrix size in the readout direction. In EPI, the pixel bandwidth in the phase-encode direction is smaller than this again by a factor of the image matrix size in the phase-encode direction. The fact that total bandwidth is often increased in proportion with the readout matrix dimension in order to keep rBWread/pix constant means that distortion (in distance rather than number of pixels) is approximately constant as a function of matrix size (and thereby resolution, at constant field of view). To illustrate the size of expected distortions, in a 64 × 64 matrix acquisition, a typical rBWread might be 1500 Hz/pixel, giving (as 1500/64) a rBWphase of 23 Hz/pixel. A value of ΔB0 of 50 Hz (common at high fields, see Fig. 1) would lead to a shift of 0.03 voxels in the readout direction, but 2 voxels in the phase-encode direction, or 7 mm for a typical field of view for brain imaging. In a higher resolution acquisition with a 128 × 128 matrix and the same rBWread, rBWphase would be 12 Hz/pixel and the distortion 4 voxels, but also 7 mm because of the proportionately smaller voxel size.
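
The arithmetic of this example can be reproduced with a few lines of code. The function name is hypothetical, and the field of view of 210 mm is an assumed value for the "typical field of view" mentioned above; the remaining numbers are those given in the text.

```python
def epi_shift(delta_b0_hz, bw_read_per_pixel_hz, n_phase, fov_mm):
    """Signal mislocalization in EPI for a given field offset.

    Returns the shift in the readout direction (pixels), the shift in the
    phase-encode direction (pixels), and the phase-encode shift in mm.
    """
    # Phase-encode bandwidth per pixel is lower by the phase-encode matrix size.
    bw_phase_per_pixel_hz = bw_read_per_pixel_hz / n_phase
    shift_read_px = delta_b0_hz / bw_read_per_pixel_hz
    shift_phase_px = delta_b0_hz / bw_phase_per_pixel_hz
    shift_phase_mm = shift_phase_px * fov_mm / n_phase
    return shift_read_px, shift_phase_px, shift_phase_mm

# 50 Hz offset, 1500 Hz/pixel readout bandwidth, assumed 210 mm field of view.
for matrix in (64, 128):
    read_px, phase_px, phase_mm = epi_shift(50.0, 1500.0, matrix, 210.0)
    print(f"{matrix} x {matrix}: readout {read_px:.2f} px, "
          f"phase-encode {phase_px:.1f} px = {phase_mm:.1f} mm")
```

With rBWread held fixed, the matrix size cancels in the product of pixel shift and pixel size, which is why the distortion in millimeters is the same for both acquisitions.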

The relationship between EPI distortion and field strength is not simple, depending both on hardware and usage. Susceptibility-induced field changes increase linearly with static magnetic field strength while gradient amplitude (the factor which limits sampling rate) is approximately constant in the standard to high field regimes. While theoretically this leads to an approximate proportionality between distortion and field strength, in practice higher acquisition bandwidths are often used at high field to achieve shorter effective TEs, to match reduced T2* times.

Image distortion frustrates attempts to coregister data from many subjects to a common probabilistic atlas [84], which can reduce significance in fMRI even in relatively homogeneous areas [85]. Established methods for correcting image distortion are compared for their performance in the amygdala in Sect. 3.3.1.

3.1.4 Physiological Artifacts

A number of physiological processes give rise to fluctuations in the MR signal which are unrelated to neuronal activation, and should therefore be corrected for or modeled in a statistical analysis. The amygdala area is particularly prone to cardiac artifacts due to the proximity of the arteries in the Circle of Willis, and to respiratory artifacts because of the susceptibility gradients.

Respiration leads to head motion, changes in the magnetic field distribution in the head due to changes of gas volume or oxygen concentration in the chest [86], and variation in the local oxyhemoglobin concentration, probably due to flow changes in draining veins [87]. Subtle changes in respiration rate and depth are thought to be the origin of spontaneous changes in arterial carbon dioxide level at about 0.03 Hz which have been shown to lead to significant low-frequency variations in BOLD signal [88]. The lag of 6 s in this process corresponds to the time taken for blood to transit from the lungs to the brain, and for cerebral blood flow volume to respond to CO2, a cerebral vasodilator. Magnetic field changes in the head particularly affect ventral brain imaging due to high field gradients. Respiration-related artifacts typically affect the image periphery, making them problematic for the amygdala, which is usually at the anterior boundary of the signal-providing region.

Cardiac pulsatility causes expansion of the arteries, bulk motion of the brain, and cerebrospinal fluid flow and leads to the influx of fully relaxed spins into an imaging slice. As a consequence, the signal may increase in many of the arteries that lie close to the amygdala, such as the middle cerebral artery and other elements of the Circle of Willis [89]. Cardiac artifacts are particularly complex with regard to emotion studies as the amygdala innervates the autonomic nervous system via the hypothalamus and brainstem, increasing heart rate, as has been shown in fMRI [90], and human depth electrode studies [91]. Recently, fluctuations in cardiac rate have been shown to explain almost as much variation in the BOLD signal as the oscillations related to each cardiac cycle, as revealed by shifted cardiac rate regressors [92].

Cardiac and respiratory cycles are connected by a number of processes [93], leading to many regions showing BOLD fluctuations of cardiac origin [92] being also observed in studies of respiratory effects [94].

Cardiac and respiratory artifacts may be corrected for by a number of approaches, some of which require additional measurements at the time of imaging. The effectiveness of these techniques in the ventral brain is outlined in Sect. 3.3.2.

3.1.5 Motion Artifacts

Motion artifacts affect all regions of the brain, but are particularly problematic in emotion studies because the nature of the task material is prone to induce SCM as a startle, attention, or repulse response. Patients with disorders with emotional components (such as schizophrenia and posttraumatic stress disorder) are less likely to remain still throughout the experiment and the interaction between motion and distortion in regions of high susceptibility gradient produces nonlinear pixel shifts that are not well corrected with rigid-body methods. Partial brain coverage protocols, such as those that may be used to allow z-shimming or high spatial and temporal resolution fMRI in the amygdala, are also more prone to partial voluming in the outermost slices and spin history effects, in which motion between the acquisition of adjacent slices leads to some spins being excited twice within one repetition time (TR) while others are not excited at all.

Head motion can be minimized using bite bars, vacuum cushions, thermoplastic masks, or plaster head casts. As well as effective immobilization, casts allow for repositioning in longitudinal studies [95]. Such devices are not appropriate for emotion studies, however, due to the added degree of discomfort and distraction they provide.

SCM was originally investigated by Hajnal et al. [96] in hybrid simulations with quite large (3 mm) introduced pixel shifts, which led to peripheral correlations. A study by Field et al. [97] found that small-amplitude motion can lead to false positive results, particularly in regions of high field gradient. Likewise, larger motions can reduce significance and lead to false negative results. Two distinct patterns of SCM are often observed in fMRI experiments. As in the example of identified motion with sample schizophrenic patients and controls (Fig. 2, left), patients may execute large motions at the first presentation of a stimulus, and many patients and controls show very small displacements which endure for entire blocks. Reproducing the submillimeter head motions observed in that experiment in a separate session (without stimuli), these have been shown to lead to highly significant correlations in the amygdala which are difficult to distinguish from genuine activation (Fig. 2, right), a problem not mitigated by standard motion correction methods [98].
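
A simple screening check for the second pattern, small displacements which endure for entire blocks, is to correlate each realignment parameter with the task regressor. The sketch below uses synthetic data; the function names, block timing, displacement amplitude, and noise level are all illustrative assumptions, and this is not the analysis used in [98].

```python
import numpy as np

def boxcar(block_onsets, block_length, n_volumes):
    """Task regressor: 1 during stimulus blocks, 0 elsewhere (units of volumes)."""
    regressor = np.zeros(n_volumes)
    for onset in block_onsets:
        regressor[onset:onset + block_length] = 1.0
    return regressor

def scm_correlation(motion_trace, task_regressor):
    """Pearson correlation between one realignment parameter and the task regressor."""
    return float(np.corrcoef(motion_trace, task_regressor)[0, 1])

# Synthetic example: a 0.2 mm displacement held during each block, plus noise.
rng = np.random.default_rng(0)
task = boxcar(block_onsets=(20, 60, 100), block_length=15, n_volumes=120)
noise = rng.normal(0.0, 0.05, task.size)
sustained = 0.2 * task + noise
print(f"sustained 0.2 mm shift: r = {scm_correlation(sustained, task):.2f}")
print(f"noise only:             r = {scm_correlation(noise, task):.2f}")
```

A high correlation flags a run for closer inspection, but a low one does not rule out SCM, since brief movements at stimulus onset correlate only weakly with a block regressor.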

3.1.6 Colocalized Resting State Networks

An additional methodological confound comes in the form of RSNs, which constitute additional sources of signal fluctuations unrelated to experimental task. In the absence of tasks or stimuli, the brain undergoes slow (0.01–0.1 Hz) fluctuations in functionally related networks of brain regions [99, 100]. These endure during task execution, and have been shown to account not only for much of the intertrial variation in the evoked BOLD response [101], but also for the intertrial variability in behavior [102]. Approximately ten such RSNs have been discovered over the past decade [99, 100, 103–105] in networks relating to sensory or cognitive function. A network with similar low-frequency characteristics has recently been identified in the amygdala and basal ganglia [106].

The network illustrated in Fig. 3 shows the results from a group independent component analysis (ICA), performed with MELODIC [107], of resting state data acquired from 26 subjects. It is continuous, fully incorporating symmetrically the striate nuclei (pallidum, putamen, and caudate nuclei), extending inferiorly to the amygdaloid complexes. The network is weaker than those previously reported (measured by the amount of variance it explains in the data), but is reproducible across subgroups of subjects, runs, and resting state conditions (fixation and eyes closed) and offers a tantalizing explanation as to why, despite the fact that neurovascular reactivity is high in deep gray nuclei, BOLD signal changes are weaker and less consistent in the amygdalae and basal ganglia than in the cortex.

This may not be the only RSN in which the amygdala is involved. Correlations were observed between the amygdalae, and between the amygdalae and hippocampi and anterior temporal lobes in one of the earliest resting state analyses, using functional connectivity [100]. The amygdala was also listed as an element in the “default mode” network [108], when originally reported as regions showing deactivations across a number of tasks in PET [109]. The fact that the amygdala has not been observed as part of this network in this context may relate to the technical challenges of measurement discussed in this chapter.

3.2 MR Methods, Sequences, and Protocols

3.2.1 Field Strength

While the signal to noise ratio (SNR), the magnitude of BOLD signal changes, and the specificity of the BOLD response to microvascular contributions all increase with field strength, so do physiological noise, field inhomogeneities, and physiological artifacts which specifically affect the anterior MTL. The advantages of high field for emotion studies are therefore restricted to particular regimes and methods in which these problems are minimized. Human emotion fMRI studies have been carried out at field strengths from 1.0 to 7.0 T. In line with the development of sequences and approaches to EPI in susceptibility-affected areas which are discussed in Sects. 3.2.2–3.2.8 (high-resolution single and multishot EPI, multiecho and spiral acquisitions, gradient compensation, and parallel imaging), emotion fMRI in the high field regime (3.0–4.0 T) has become commonplace, although applied studies have generally used standard sequences and parameters despite the problems which have received attention in the MR literature [110] and a number of promising remedies (see the following sections). Ultra-high field strength studies of emotion are still sparse, however, and it is likely that they will be restricted to highly specific questions during the next 5–10 years of hardware and sequence development.

Theoretical gains in SNR at high field are limited by physiological noise, which increases both with field strength and voxel size, and causes time-series SNR (tSNR) to reach an asymptotic limit with voxel volume [111]. This limit was found to increase only modestly with field strength, being 65 at 1.5 T, 75 at 3 T, and 90 at 7 T, so that for large (5 × 5 × 3 mm) voxels, tSNR was only 11 % higher at 3 T than at 1.5 T, and only 25 % higher at 7 T than 1.5 T. The tendency toward asymptotic behavior began at relatively small voxel volumes, with 80 % of the asymptotic maximum being reached at 28.6, 15.0, and 11.7 mm³ at 1.5, 3, and 7 T, respectively. For small voxels, however, where thermal noise dominates, tSNR gains were almost linear with field strength. In the same study, the authors found that with 1.5 × 1.5 × 3 mm³ voxels, tSNR increased by 110 % at 3 T compared to 1.5 T, and by 245 % at 7 T compared to 1.5 T [111]. This study clearly shows that tSNR gains are to be made at high field in the small voxel volume regime.
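
The saturation of tSNR can be illustrated with a noise model in which physiological noise is proportional to signal, giving tSNR = SNR0 / sqrt(1 + λ² SNR0²), where SNR0 is the image SNR and 1/λ is the asymptotic limit. We assume this functional form here for illustration. The values of λ below are simply chosen so that 1/λ matches the limits quoted above; they are not fitted parameters taken from [111], and the function names are hypothetical.

```python
import math

def tsnr(snr0, lam):
    """Time-series SNR when physiological noise is proportional to signal."""
    return snr0 / math.sqrt(1.0 + (lam * snr0) ** 2)

def snr0_for_fraction(fraction, lam):
    """Image SNR at which tSNR reaches a given fraction of its asymptote 1/lam."""
    return fraction / (lam * math.sqrt(1.0 - fraction ** 2))

# Asymptotic limits quoted in the text; lam = 1 / limit is an assumption.
for field, limit in (("1.5 T", 65.0), ("3 T", 75.0), ("7 T", 90.0)):
    lam = 1.0 / limit
    snr0_80 = snr0_for_fraction(0.8, lam)
    print(f"{field}: 80 % of the limit at SNR0 = {snr0_80:.1f}, "
          f"tSNR = {tsnr(snr0_80, lam):.1f}")
```

In this model tSNR is close to SNR0 when SNR0 is small (the thermal noise regime) and is insensitive to further increases in SNR0 once λ SNR0 exceeds about 1, which is the qualitative behavior described above.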

These tSNR results also explain the often modest gains achieved in fMRI studies at higher field, particularly in regions affected by signal dropout. Krasnow et al. [112] compared activation in response to perceptual, cognitive, and affective tasks at 1.5 and 3 T with a relatively large voxel protocol (3 × 3 × 4 mm) and observed only moderate increases in activated volume at 3 T for the perceptual and cognitive tasks (23 and 36 %, respectively), but no significant improvement in the activated amygdala volume due to increased susceptibility-related signal loss. A high-resolution, high-field approach has been exemplified in the only human study of amygdala function at 7 T to date of which we are aware, which was carried out at submillimeter resolution [113].

These studies define the regime in which field strength gains are to be made, but it is fair to ask why one should move to high-resolution measurements if the neuroscience question does not require, for instance, subnuclei of the amygdala to be resolved, but, as is more commonly the case, the study of interactions between the amygdalae and the cortex, for which whole brain coverage is essential. The use of high resolution here is not principally to distinguish activation in small structures, but to reduce both physiological noise and susceptibility artifacts. A number of works have shown the value of averaging thin slices, downsampling, and smoothing data acquired at high resolution [114–116] and using multichannel coils [115] to regain losses in SNR inherent to small voxels generally and yielding net gains in susceptibility-affected areas [115, 117].

3.2.2 z-Shimming, Gradient Compensation, Tailored RF Pulses

The effect of signal dephasing arising from through-plane gradients may be reduced by creating a composite image from a number of acquisitions in which different slice-select gradients are applied [118], a process known as z-shimming. In each image the applied gradient pulse is appropriate to counteract susceptibility gradients in particular regions. The method is effective in regaining signal in the anterior MTL, but clearly reduces temporal resolution by a factor equal to the number of images acquired, usually a minimum of 3. Alternatively, a single, moderate preparation pulse may be used. This reduces through-plane dephasing in affected areas at limited cost to BS and signal in homogeneous areas, and allows slices to be orientated so that TE shifts are small, reducing signal loss due to in-plane gradients [119]. z-Shimming and other compensation schemes have been applied in a number of other sequences described in this section.
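
The principle of the composite image can be sketched for an idealized slice. We assume a rectangular slice profile and a linear through-slice field gradient, so that the relative signal is the magnitude of a sinc function of the net phase accumulated across the slice at TE. The compensation steps and the root-sum-of-squares combination are illustrative assumptions; other step sizes and combination rules are possible, and this is not necessarily the scheme of [118].

```python
import numpy as np

def slice_signal(susc_cycles, comp_cycles):
    """Relative signal for a linear through-slice phase ramp.

    Both arguments give the phase accumulated across the slice at TE, in cycles:
    susc_cycles from the susceptibility gradient, comp_cycles from the applied
    compensation gradient pulse.
    """
    return np.abs(np.sinc(np.asarray(susc_cycles, dtype=float) + comp_cycles))

def z_shim_composite(susc_cycles, comp_steps=(-1.0, 0.0, 1.0)):
    """Root-sum-of-squares combination of acquisitions with different compensations."""
    images = [slice_signal(susc_cycles, c) for c in comp_steps]
    return np.sqrt(np.sum(np.square(images), axis=0))

# Three example voxels: no, moderate, and complete dephasing without compensation.
susc = np.array([0.0, 0.5, 1.0])
print("single acquisition:", np.round(slice_signal(susc, 0.0), 3))
print("z-shim composite:  ", np.round(z_shim_composite(susc), 3))
```

The voxel that is fully dephased in the uncompensated image is recovered by the acquisition whose compensation matches its susceptibility gradient, at the cost of acquiring three images per volume.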

Spins may also be refocused using tailored radio frequency pulses which create uniform in-plane phase but quadratic phase variation through the slice, allowing dephasing to be “precompensated” [120]. Analogous to z-shimming, in the original implementation a number of acquisitions with different precompensations were required, suited to different regions. More recently 3D versions have been developed, and while these are promising the pulse lengths are long, and the distribution of susceptibilities must be known [121], or calculated iteratively online [122]. These are, however, important steps toward single-shot compensation of susceptibility dropout.

3.2.3 Slice Orientation and Gradient Directions

Divergent findings and recommendations for the optimum slice orientation for amygdala fMRI are due to the absence, until relatively recently, of an adequate description of signal loss and BS in the presence of field gradients [81, 119, 123].

In many early studies, quite nonisotropic voxels were used to achieve short TR while minimizing demands on scanner hardware, with slice thickness being substantially larger than the in-plane voxel size. Gradients across voxels were highest then, and signal loss most severe, if the direction of strongest field gradient was along the slice (through-plane) direction [124]. With many studies finding that the direction of the field vector across the amygdala was principally superior-inferior [125], this prescription precluded an axial orientation. As bilateral structures, the amygdalae could be imaged in the same slice in the coronal but not the sagittal plane, leading to the coronal orientation being preferred by many [110].

The optimum imaging plane is also dependent on whether gradient compensation is used [81]. If so, through-plane gradients may be compensated for with a moderate gradient in the slice direction, although this will lead to a small decrease in BS in unaffected areas. The slice can then be orientated so that in-plane gradients are below the critical threshold for Type 2 signal loss. The value of this has been demonstrated in the orbitofrontal cortex [119] but the approach yields lower rewards in the amygdala region [126] as gradients are higher (making it more difficult to find a suitable value for compensation), and are more variable between subjects.

The simulations of Chen et al. [125] for the amygdala suggested that the maximum BS was to be achieved by orienting the slice direction perpendicular to the maximum gradient vector and the readout direction parallel to it, indicating an (oblique) coronal orientation with superior–inferior readout. The angle between the gradient vector and the superior–inferior direction was shown to vary widely between subjects (from −7° to +26° at 1.5 T, from −5° to +34° at 3 T), meaning that field gradients need to be mapped for each subject before measurement. This scheme also invokes distortions which are asymmetric about the midline (left–right). If erroneous conclusions about lateralization are to be avoided, residual distortions in the amygdala should be symmetric, requiring the phase-encode direction to be superior–inferior for coronal slices or anterior–posterior for axial slices.

As well as the direction of imaging gradients, the sign of phase-encode blips is important for signal loss and BS [123]. Encoding in EPI can be either with a large positive phase-encode “prewinder” followed by a succession of small negative “blips,” or a negative prewinder followed by positive blips. In homogeneous fields these schemes are equivalent, but we have seen that in the presence of susceptibility gradients echo positions are shifted away from the center of k-space, along the phase-encode axis. Positive and negative blip schemes have quite different properties, therefore, depending on whether the component of susceptibility gradient in the phase-encode direction is itself positive or negative [123]. The phase-encode direction (PE), slice angle, and z-shimming prepulse gradient moments (PP) that lead to maximum BS for EPI with otherwise standard EPI parameters (TE = 50/30 ms at 1.5 T/3 T, 3 × 3 × 2 mm³ voxels) have been measured throughout the brain by Weiskopf et al. at 1.5 and 3 T [126]. They define positive slice angles as being those in which, beginning from the axial plane, the anterior edge is tilted toward the feet, and a positive PE as being that in which the prewinder gradient points from the posterior to the anterior of the brain. In the amygdala they find that the highest BS is achieved with positive PE, a −45° slice tilt and a PP = +0.6 mT/m ms at 3 T, and positive PE, −45° slice tilt and PP = 0.0 mT/m ms at 1.5 T. These values led to a 14 % increase in BS at 3 T over a standard acquisition (with positive PE, a 0° slice tilt and a PP = −0.4 mT/m ms) but only 5 % at 1.5 T. This indicates that BS can be increased by selecting optimum geometry parameters and compensation gradients, although the improvement is more modest than that which has been demonstrated with the more technically challenging or time-consuming strategies described in this chapter. The gradient and geometry values suggested in Weiskopf et al. [126] should be adopted for EPI with standard parameters at these field strengths. At other field strengths their analysis could be followed, or interpolated values adopted from the trends evident in that study.

3.2.4 Voxel Size

Among many solutions to the problem of signal loss in the anterior MTL, reduced voxel size was established very early as an effective means of mitigating susceptibility-related signal loss [127, 128]. Equation (2) describes how the rate of signal decay is reduced with voxel size by lowering field gradients across voxels. The effectiveness of this can be seen in the 4-T images of Fig. 4 over a range of resolutions, with T2* in the amygdala (measured with a multiple gradient-echo sequence with the same geometry as the EPI) increasing from 22 to 38 ms when the voxel size is reduced from 64 to 8 mm³, with corresponding EPI signal increase apparent in the anterior MTL.

Reducing voxel size comes at the expense of temporal resolution (or brain coverage) and SNR. The relationship between image SNR and voxel volume, ΔV, is

$$ \mathrm{SNR}=\Delta V\sqrt{\frac{N_x{N}_y{N}_z}{\mathrm{rBW}}}, $$
(3)

where N_i is the number of samples in direction i and rBW the receiver bandwidth [129]. The commonly held view that SNR is simply proportional to voxel volume is premised on changing the volume via the field of view [130], or on the assumption that, in addition to increasing N_x and N_y by a factor f (considering only in-plane resolution), receiver bandwidth is also increased by the same factor. If receiver bandwidth and field of view are held constant, however, then we see from Eq. (3) (because \( N_x N_y N_z = \left. k/\Delta V \right|_{\mathrm{FOV}} \), where k is the total imaged volume) that SNR is proportional to the square root of the voxel volume, and SNR may be restored by downsampling high-resolution images. In this time-consuming scheme, partial k-space acquisition may be used to achieve the desired TE, SNR can be increased with multichannel coils, as has been validated for the MTL [115], and parallel imaging can be used to reduce an otherwise long TR.
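
The two scaling regimes of Eq. (3) can be checked numerically. The sketch below evaluates the right-hand side of Eq. (3) up to a constant factor; the function name, fields of view, and bandwidth are illustrative assumptions.

```python
import math

def relative_snr(voxel_mm, fov_mm, rbw_hz):
    """Right-hand side of Eq. (3): dV * sqrt(Nx * Ny * Nz / rBW), arbitrary units."""
    dv = voxel_mm[0] * voxel_mm[1] * voxel_mm[2]
    n_samples = 1.0
    for size, fov in zip(voxel_mm, fov_mm):
        n_samples *= fov / size  # number of samples along this direction
    return dv * math.sqrt(n_samples / rbw_hz)

rbw = 100e3  # assumed receiver bandwidth in Hz, held constant throughout

# Case 1: constant field of view, voxel edge halved (volume reduced 8-fold).
coarse = relative_snr((4.0, 4.0, 4.0), (256.0, 256.0, 128.0), rbw)
fine = relative_snr((2.0, 2.0, 2.0), (256.0, 256.0, 128.0), rbw)
print(f"constant FOV:    SNR ratio = {fine / coarse:.3f} (sqrt(1/8) = {math.sqrt(1 / 8):.3f})")

# Case 2: voxel volume changed via the field of view, matrix size unchanged.
small_fov = relative_snr((2.0, 2.0, 2.0), (128.0, 128.0, 64.0), rbw)
print(f"constant matrix: SNR ratio = {small_fov / coarse:.3f} (1/8 = {1 / 8:.3f})")
```

In the constant field of view case, averaging the eight small voxels that make up one large voxel would recover a factor of sqrt(8) if their noise were uncorrelated, which is the sense in which downsampling restores SNR.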

While this analysis provides the basis for the dependence of image signal on imaging parameters, it neglects the effects of physiological noise. The most important measure of signal in this context is tSNR, which translates into the feasibility of detecting a specified signal change in fMRI [131] and has been shown to be useful in assessing the viability of amygdala fMRI in individual subjects [132]. In a study of optimum parameters for GE-EPI for 3-T amygdala EPI with a volume coil, a protocol with approximately 2-mm isotropic voxels was found to yield 60 % higher tSNR than a protocol with standard parameters (with approximately 4-mm isotropic voxels) [117], despite having been measured at twice the receiver bandwidth. Additional gains with smaller voxels (thinner slices) were not large, because T2* had already increased to a value close to that in homogeneous regions. This is in concordance with model calculations which suggest that 2 mm represents the smallest voxel size that should be used for amygdala imaging providing the activated size is itself at least 2 mm [125].

There are many differences between the conditions and metrics of the methodological work cited and typical fMRI studies. It is encouraging, therefore, that these findings have been confirmed in the significance and extent of amygdala activation in fMRI experiments [133, 134].

In summary, small voxels should be used in high field strength studies in order to operate in a regime dominated by technical, rather than physiological noise. In inhomogeneous regions this results in reduced field gradients, reducing signal loss and echo shifts, making BS more uniform in the volume. Time-series SNR may be increased by using multichannel coils and downsampling small voxels.

3.2.5 Echo Time

Taking the simplest approach of matching effective echo time (TEeff) to the T2* of the structures of interest in GE-EPI might be seen as problematic in acquisitions with large voxels, in which T2* values vary quite widely (e.g., between the amygdala and the fusiform face area). One solution is to use a multiecho sequence, in which the echo time of each image is appropriate for regions with particular field gradients, as will be described in more detail in Sect. 3.2.8. A novel solution to matching TEeff to T2* in the amygdala without sacrificing BS in more dorsal slices is to use an axial acquisition with slice-specific TE, demonstrated at 1.5 T with TEeff = 60 ms in dorsal slices, TEeff = 40 ms in ventral slices, and a transition zone with intermediate effective TE [135].

It should be remembered, though, that the maximum of BS is quite flat as a function of TE, and TE is itself not well defined in EPI. In the previous sections, we also saw that in-plane susceptibility gradients change local TE [81]. This exposes the limitation of the approach of simply reducing the TEeff of the sequence. In the common, negative blip scheme, signal in the anterior MTL will in fact be shifted to a longer TE. Using a short TEeff makes the sequence more prone to complete (type 2) signal loss.

This explains the experimental findings of Gorno-Tempini et al. [136] and Morawetz et al. [134]. In 2-T dual-echo EPI with large voxels, Gorno-Tempini et al. found that, although signal loss was reduced at the short TE (26 ms), BOLD activation in the hippocampus was significantly greater at the longer TE (40 ms). Morawetz et al. [134] compared the efficacy of four EPI protocols in mapping amygdala activation, using variants with two different TEs (27 and 36 ms) and slice thicknesses (2 and 4 mm), all with high in-plane resolution (2 mm). Activation results were poor in the 4-mm protocols, even at the shorter TE.

A more effective approach than reducing TEeff is to reduce susceptibility gradients, and thereby signal dephasing and echo shifts, using the techniques described earlier: gradient compensation, selection of appropriate gradient direction and slice orientation, and the use of smaller voxels. This increases T2* in susceptibility-affected regions and, by reducing echo shifts, makes BS more homogeneous throughout the imaging volume. Conditions then approach those with a homogeneous static field, where BS is maximized by using \( {\mathrm{TE}}_{\mathrm{eff}}={T}_2^{*} \).
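
The optimum at TEeff = T2*, and the flatness of the maximum noted above, can be checked numerically. The sketch below assumes the commonly used model in which BS is proportional to TE · exp(−TE/T2*) in a homogeneous field. That model and the function names are assumptions for illustration; the T2* of 38 ms is the amygdala value quoted in this section for 2 × 2 × 2-mm data at 4 T.

```python
import math

def relative_bs(te_ms, t2star_ms):
    """BOLD sensitivity up to a constant factor, under the assumed
    model BS ~ TE * exp(-TE / T2*)."""
    return te_ms * math.exp(-te_ms / t2star_ms)

def best_te(t2star_ms, step_ms=0.1, max_ms=150.0):
    """Grid search for the TE that maximizes relative_bs."""
    grid = [i * step_ms for i in range(1, int(max_ms / step_ms) + 1)]
    return max(grid, key=lambda te: relative_bs(te, t2star_ms))

t2star = 38.0  # ms
print(f"optimum TE is about {best_te(t2star):.1f} ms")
for te in (27.0, 38.0, 50.0):
    frac = relative_bs(te, t2star) / relative_bs(t2star, t2star)
    print(f"TE = {te:.0f} ms gives {100 * frac:.0f} % of the maximum BS")
```

Under this model, a TE well below T2* still retains most of the maximum sensitivity, which is why echo shifts and signal dephasing, rather than the nominal TE, tend to dominate in susceptibility-affected regions.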

The increase in T2* in the amygdala with reduced voxel size is illustrated at 4 T in Fig. 4: from 22 ms in a 4 × 4 × 4-mm acquisition to 38 ms in 2 × 2 × 2-mm data, consistent with previous results at 3 T [117]. Likewise, the increase in BS was illustrated in the Morawetz et al. study [134], in which robust amygdala activation was only detectable in the high-resolution acquisition.

3.2.6 Parallel Imaging

The previous sections have shown that many of the techniques which mitigate susceptibility-related signal loss in the amygdala, hypothalamus, and MTL are also time consuming, limiting either temporal resolution or brain coverage. This is undesirable where brain coverage cannot be reduced to the amygdala. Parallel imaging allows acceleration by undersampling k-space and using the sensitivity profiles of a number of receiver channels to reconstruct data without image fold-over [137, 138]. By this means it is possible to reduce TEeff, which reduces susceptibility loss, and to reduce TR by the acceleration factor. Image distortions and echo shifts are likewise reduced by the acceleration factor so that even at the same effective TE as in a conventional acquisition, signal loss in the amygdala region is lower (Fig. 4, bottom right). The noise properties of images reconstructed from parallel acquisition lead to BS reductions of the order of 15–20 % in other regions, however [139].

The effectiveness of parallel imaging and suitable acceleration factors for the MTL have been studied by Schmidt et al. [140]. Statistical power in the study of MTL activation was higher in the parallel-acquisition data with an acceleration factor of 2 than in the acquisition without acceleration, but neither image quality nor statistical power improved with higher acceleration factors, as noise and reconstruction artifacts reduced tSNR prohibitively. Particular gains in BS can be made in the MTL using parallel imaging with a modest acceleration factor combined with high-resolution imaging [115]. Combining parallel imaging, high-resolution and high field has even allowed differential response of the hypothalamus to be recorded in response to funny as opposed to neutral stimuli at 3 T [141, 142], which could potentially be used to diagnose narcolepsy and cataplexy.

3.2.7 Flip Angle

The following is a consideration which is common to fMRI studies in all brain regions. The flip angle that should be used in a sequence is that which maximizes the signal with a particular experimental TR. In a spoiled gradient-echo sequence this is the Ernst angle, θE, given by

$$ {\theta}_{\mathrm{E}}= \arccos \left({\mathrm{e}}^{\frac{ - \mathrm{T}\mathrm{R}}{T_1}}\right). $$

T1 values can be taken from the literature, if available, or mapped in a single study of a representative group of subjects, most simply using an inversion recovery sequence and a range of inversion times. At high (3.0–4.0 T) and very high field (7.0 T or higher), dielectric effects lead to B1 inhomogeneity, and the flip angles achieved deviate from nominal values. Particularly at 7.0 T it is worthwhile to map the RF field (e.g., using the 180° signal null point in a simple spoiled gradient-echo sequence [143]) in order to calibrate nominal flip angles.
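
The Ernst angle expression above is easily evaluated for a given protocol. The following is a minimal sketch: the function name `ernst_angle_deg` is illustrative, and the T1 of 1.3 s is an assumed placeholder rather than a value from the text, so it should be replaced with a literature or measured value for the tissue and field strength in question.

```python
import math

def ernst_angle_deg(tr, t1):
    """Ernst angle in degrees: arccos(exp(-TR / T1)).
    tr and t1 must be given in the same units."""
    if tr <= 0 or t1 <= 0:
        raise ValueError("TR and T1 must be positive")
    return math.degrees(math.acos(math.exp(-tr / t1)))

T1_ASSUMED_S = 1.3  # hypothetical gray-matter T1 in seconds
for tr in (0.5, 1.0, 2.0, 4.0):
    angle = ernst_angle_deg(tr, T1_ASSUMED_S)
    print(f"TR = {tr:.1f} s: Ernst angle = {angle:.1f} degrees")
```

The angle approaches 90° as TR becomes long relative to T1, so the choice matters most for the short TRs that acceleration makes possible.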

3.2.8 Alternatives to 2D, Single-Shot, Gradient-Echo EPI

If multiple echo images are acquired following a single excitation, the range of TEeff in these provides near-optimum BS for a number of regions [144, 145]. Images acquired at different TEs may be analyzed separately, or combined to maximize BOLD contrast-to-noise ratio [145]. Acquiring multiple images in a single shot also allows additional features to be built into the sequence, such as 3D gradient compensation, in which different combinations of compensation gradients are applied to each echo [146], leading to excellent signal recovery in the amygdala in the combined image [147]. Alternatively, the phase-encoding gradient polarity may be reversed to yield images with distortions in opposite directions, allowing for their correction [148].

Similar multiecho and compensation techniques have been applied to spiral acquisitions. A spiral-in trajectory has been shown to reduce signal loss compared to a conventional spiral–out scheme with the same TE, and SNR and BS could be increased with a spiral in–out scheme by combining images optimally from the two acquisitions [149]. A number of variants of this have been developed to further reduce susceptibility artifacts, including applying a z-shim gradient to the second echo [150] or subject-dependent slice-specific z-shims to both echoes [151].

A number of segmented methods are being developed to overcome the temporal constraints of multiecho and high-resolution acquisitions. In conventional segmented EPI, subsets of interleaved k-space lines are acquired after successive excitations. The higher phase-encode bandwidth leads to reduced distortions and smaller echo shifts, but the method is inherently slow and prone to motion and physiological fluctuations, as each image is built up over a number of TRs. In the MESBAC sequence, navigator echoes are acquired in both the readout and phase-encode directions between each segment. Multiple echoes are acquired with different amounts of compensation for each echo [152], and combined to give impressive signal in inferior frontal areas.

3.2.9 Summary

In the subsections of Sect. 3.2 we have looked at the influence of field strength, gradient compensation, slice orientation, voxel size, TE, and acquisition acceleration factor on susceptibility-related signal and BS reduction in the anterior MTL, as well as discussing some variants of multiecho and spiral schemes which have been tailored for this region. While the interdependent nature of EPI parameters and the changing considerations at different field strengths necessarily make some of these choices complex, we would like to pick out two lines of approach presented here as being particularly effective, and clarify recommendations.

The first approach is high-field, high-resolution single-shot EPI with gradient compensation and acceleration. BOLD signal changes are greater at high field (3.0–4.0 T), and the tSNR advantages of high field strength are capitalized upon by measuring with small voxels (circa 8 μl), where thermal noise rather than physiological noise dominates. Measuring with small voxels reduces signal dephasing, making T2* more homogeneous. Shifts in local TE are also smaller, reducing Type 2 signal loss and increasing BOLD sensitivity. Moderate slice-select gradient compensation and an oblique axial acquisition with a tilt of between 20 and 45° (anterior slice edge toward the head) reduce in-plane gradients and echo shifts further. With susceptibility gradients reduced (evidenced by T2* values close to those in magnetically homogeneous regions), BS can be maximized by setting \( {\mathrm{TE}}_{\mathrm{eff}}={T}_2^{*} \). This TEeff can be reached using parallel imaging acceleration (e.g., factor 2), which further reduces both TE shifts and image distortion. Images acquired with these parameters have high signal in the anterior MTL, low distortion, and quite homogeneous BS. Time-series SNR can be increased before statistical analysis by downsampling or smoothing images. This approach is attractive in that it may be achieved on most modern high-field systems.

Sect. 3.2.2 discussed not only the value of gradient compensation but also its high cost in temporal resolution if images with a number of compensation gradients are acquired. The second approach we wish to highlight involves the application of a range of compensation gradients to each of a number of echoes acquired after a single excitation, thereby reducing the time penalty. Both the multiecho echo-planar [146] and multiecho spiral acquisitions [151] described in Sect. 3.2.8 have been shown to be effective in reducing susceptibility-related signal loss in the anterior MTL.

3.3 Correction Methods

3.3.1 Distortion Correction with the Field Map and Point-Spread Function Methods

The field map (FM) method was first described by Weisskoff and Davis [153] and developed by Jezzard and Balaban [154]. In Sect. 3.1.2 we saw that distortion in EPI is only significant in the phase-encode direction and that the number of pixels by which signal is mislocated is equal to the local field offset divided by the bandwidth per pixel in the phase-encode direction. In the FM method, static magnetic field deviations, ΔB, are calculated from the phase difference, Δϕ, between two scans with TE separated by ΔTE (or a dual-echo scan), using the relation \( \Delta B=\Delta \varphi /\left(2\pi \gamma \Delta \mathrm{TE}\right) \), where γ is the gyromagnetic ratio in Hz/T. This map is distorted (forward-warped) to provide a map of the voxel shifts required to reverse the distortion at each EPI location. Gaps in the corrected image are filled by interpolation.
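
The two steps just described, converting a phase difference into a field offset and then into a displacement in pixels, can be sketched for a single voxel as follows. The function names and all numerical values are hypothetical, the phase difference is assumed to be already unwrapped, and γ is taken as the proton gyromagnetic ratio of about 42.58 MHz/T.

```python
import math

GAMMA_HZ_PER_T = 42.577e6  # proton gyromagnetic ratio, Hz/T

def field_offset_hz(delta_phi_rad, delta_te_s):
    """Off-resonance frequency (Hz) from the unwrapped phase
    difference between two echoes separated by delta_te_s."""
    return delta_phi_rad / (2.0 * math.pi * delta_te_s)

def field_offset_tesla(delta_phi_rad, delta_te_s):
    """The same offset expressed as a field deviation in tesla."""
    return field_offset_hz(delta_phi_rad, delta_te_s) / GAMMA_HZ_PER_T

def pixel_shift(offset_hz, bandwidth_per_pixel_hz):
    """Displacement along the phase-encode direction, in pixels."""
    return offset_hz / bandwidth_per_pixel_hz

# Hypothetical example values for one voxel.
delta_phi = 0.5      # rad, unwrapped phase difference
delta_te = 1.0e-3    # s, echo time difference
bw_per_pixel = 20.0  # Hz per pixel in the phase-encode direction

offset = field_offset_hz(delta_phi, delta_te)
print(f"field offset: {offset:.1f} Hz "
      f"({field_offset_tesla(delta_phi, delta_te):.2e} T)")
print(f"displacement: {pixel_shift(offset, bw_per_pixel):.1f} pixels")
```

The low bandwidth per pixel in the phase-encode direction of EPI is what turns modest field offsets into displacements of several pixels.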

While the FM method is undemanding from the sequence perspective, considerable postprocessing is required to produce FMs that do not contain errors. Phase imaging is only capable of encoding phase values in a 2π range, with values outside this range being aliased, causing “wraps” in the image. These can be removed in the spatial domain using a number of freely available algorithms (e.g., PRELUDE [155] or ΦUN [156]), or by examining voxel-wise phase evolution in time if three or more echoes are acquired [157]. If imaging is being carried out with a multichannel radiofrequency receive coil, phase images created via the sum-of-squares reconstruction [158] will show nonphysical discontinuities from arbitrary phase offsets between the coil channels (incongruent wraps) unless these offsets are removed [159, 160]. Alternatively, images from channels may be processed separately and individual FMs, weighted by coil sensitivities, combined. In 2D spatial unwrapping, additional global, erroneous 2π phase changes are occasionally inferred between TEs when the algorithm begins to unwrap from different sides of a phase wrap at the two TEs. In multichannel imaging, these slice phase shifts may be identified by examining the consistency between coil channels [161], as may unreliable voxels at the image edge and in regions of high field gradient. The FM may finally need to be smoothed to remove high-frequency features and dilated to ensure that it extends to the periphery of the brain.
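
The idea of removing wraps by following the phase of a voxel over successive echoes can be illustrated with a simple 1D sketch. It is not the algorithm of [157]; it assumes that the true phase changes by less than π between neighboring echoes, and the helper names `wrap` and `unwrap_over_echoes` are hypothetical.

```python
import math

def wrap(phase):
    """Map a phase to the principal 2*pi interval, as stored in a
    phase image."""
    return math.atan2(math.sin(phase), math.cos(phase))

def unwrap_over_echoes(wrapped):
    """Remove 2*pi jumps between successive echoes of one voxel.
    Assumes the true phase step between echoes is smaller than pi."""
    out = [wrapped[0]]
    for value in wrapped[1:]:
        step = wrap(value - out[-1])  # smallest step consistent with the data
        out.append(out[-1] + step)
    return out

# Hypothetical true phase evolution (rad) at four echo times.
true_phase = [0.5, 2.5, 4.5, 6.5]
measured = [wrap(p) for p in true_phase]  # values as a scanner would store them
recovered = unwrap_over_echoes(measured)
print([round(p, 3) for p in measured])
print([round(p, 3) for p in recovered])
```

If the echo spacing is too long for the local field offset, the assumption of steps smaller than π fails and the wraps cannot be resolved in this way.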

In the point spread function (PSF) approach [162] applied to distortion correction [163], the imaging sequence is similar to EPI, but with the initial phase prewinder gradient replaced by a phase gradient table, the values of which are applied in a loop. The PSF of each voxel is the Fourier transform of the acquired data, and the displacement of the voxel is the shift of the center of the PSF (e.g., if the center is at zero additional phase, this corresponds to no local field offset). For one major scanner manufacturer, this method has been robustly implemented with the flexibility to be used for parallel imaging with high acceleration factors [164].

The FM and PSF methods have been compared at 1.5 T [163]. The PSF was found to be generally superior, although some conclusions were based on deficiencies in FMs in regions of high field gradient which may be improved upon.

The effectiveness of the two methods in correcting larger distortions at 4.0 T is shown in Fig. 5, focusing on a section through the amygdala (top row), and comparing this with the situation in a more dorsal slice (bottom row). Raw and corrected EPIs are compared to a gradient-echo reference which has the same (subvoxel) distortion in the readout direction, but no distortion in the phase-encode direction. The distortion at the anterior boundary of the amygdala (A) is circa 3 mm—moderate compared to the displacement of the ventricles (9 mm at B) and the frontal gray-white matter border indicated at C (12 mm). If the multiplicity of phase information available from multichannel coils is used in the FM method [161], both FM and PSF methods perform very well in all areas, with only minor errors at the periphery of the FM-corrected images due to residual field map inaccuracies at those locations (at D, not present in the PSF-corrected images).

Fig. 5

Distortion correction of echo-planar imaging (EPI) at high field (4.0 T). A comparison of field-map (column 3) and point-spread function (column 4) correction of distortion in EPI (column 2) at the level of the amygdala (top row) compared to a more dorsal section (bottom row). Salient features have been copied from a gradient-echo geometric reference scan (column 1)

The choice of correction method is often a pragmatic one based on which is more robustly and conveniently implemented.

3.3.2 Correction of Physiological Artifacts

Physiological fluctuation in a sequence of gradient-echo images can be corrected using a navigator echo technique [165]. A single echo is acquired before the encoding scheme is begun and used to correct the phase changes in the image data which arise from susceptibility effects. This “global” correction approach, using the central k-space point only, can be extended to 1D [166] and 2D [167]. These methods are effective, but have the drawback of increasing TR.

To avoid being aliased in EPI time series, respiratory fluctuations (circa 0.2–0.3 Hz) and cardiac fluctuations (circa 1 Hz) would need to be sampled at a rate of at least 2 Hz; that is, the TR of the sequence would need to be 500 ms or less. Typical TRs in whole-brain fMRI are 1–4 s, and the previous sections have indicated that many of the strategies that should be implemented to improve data quality in fMRI for emotion studies lead to longer repetition times. Respiratory and cardiac fluctuations will therefore normally be aliased, and not generally into a particular frequency band [168]. Simple band-pass filtering is consequently not generally possible, although a range of alternative correction methods have been developed.
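
The frequency at which a periodic fluctuation appears after sampling once per TR follows from standard sampling theory, and can be sketched as below. The function name and the example frequencies are illustrative; the calculation treats each volume as sampled at a single instant and so ignores the different acquisition times of individual slices.

```python
def aliased_frequency(f_hz, tr_s):
    """Apparent frequency (Hz) of a fluctuation at f_hz when it is
    sampled once every tr_s seconds."""
    fs = 1.0 / tr_s  # sampling rate
    return abs(f_hz - fs * round(f_hz / fs))

# Hypothetical cardiac (1.1 Hz) and respiratory (0.3 Hz) rates.
for tr in (0.4, 1.0, 2.0, 3.0):
    cardiac = aliased_frequency(1.1, tr)
    resp = aliased_frequency(0.3, tr)
    print(f"TR = {tr:.1f} s: cardiac appears at {cardiac:.2f} Hz, "
          f"respiration at {resp:.2f} Hz")
```

Small changes in heart rate or TR move the aliased frequency substantially, which is why no single filter band captures these fluctuations.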

A class of correction methods requires additional physiological measurement to be made concurrent with the fMRI time-series, using a respiration belt to monitor breathing and an electrocardiogram or pulse oximeter to monitor heart rate. Applied in image space, the RETROICOR correction method involves plotting pixels according to their acquisition time within the respiratory cycle (classified also by respiration depth) and subtracting a fit to fluctuations over the cycle [169]. Despite the many reasons why physiological artifacts are expected to particularly affect amygdala fMRI, their correction with RETROICOR was found to bring only modest improvements in group fMRI results in an emotion processing task; up to 13 % in t statistic values depending on the degree of smoothing [170]. Those improvements were mostly due to correction of cardiac effects. Recent findings that cardiac rate changes lead to signal changes of similar size to the effect due to cardiac action itself [88], which are not modeled in the RETROICOR approach, suggest that further gains are possible.
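
The principle of fitting and subtracting a fluctuation that is locked to the physiological cycle can be sketched as follows. This is a simplified illustration and not the implementation of [169]: it assumes that the phase within the cycle is already known for every volume, it models the fluctuation with a Fourier series of assumed order 2, and it treats only one physiological cycle. The helper names `ols` and `remove_phase_locked` are hypothetical, as is the synthetic signal.

```python
import math

def ols(columns, y):
    """Ordinary least squares via the normal equations, solved by
    Gauss-Jordan elimination with partial pivoting."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def remove_phase_locked(signal, phases, order=2):
    """Fit a constant plus cos/sin terms of the cycle phase and
    subtract the fitted fluctuation, keeping the mean level."""
    cols = [[1.0] * len(signal)]
    for k in range(1, order + 1):
        cols.append([math.cos(k * p) for p in phases])
        cols.append([math.sin(k * p) for p in phases])
    beta = ols(cols, signal)
    fitted = [sum(b * col[n] for b, col in zip(beta[1:], cols[1:]))
              for n in range(len(signal))]
    return [s - f for s, f in zip(signal, fitted)]

# Synthetic example: a 1.07-Hz cycle sampled with TR = 2 s.
tr, f_cycle = 2.0, 1.07
phases = [(2 * math.pi * f_cycle * tr * n) % (2 * math.pi) for n in range(100)]
signal = [100.0 + 3.0 * math.cos(p) + 1.0 * math.sin(2 * p) for p in phases]
cleaned = remove_phase_locked(signal, phases)
print(f"largest residual deviation: {max(abs(c - 100.0) for c in cleaned):.2e}")
```

Because the regressors depend on the phase at the time of acquisition rather than on frequency, the fit is unaffected by the aliasing of the fluctuation.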

Modeling physiological fluctuations [171] by including measured signal as “Nuisance Variable Regressors (NVRs)” is a convenient alternative to fitting and removing them. A detailed examination of these and other sources of noise showed respiratory-induced noise particularly at the edge of the brain, larger veins and ventricles, and cardiac-induced noise focused on the middle cerebral artery and Circle of Willis, close to the amygdala [168], which could be well modeled.

A number of image-based methods for physiological artifact correction have been developed, which do not require physiological monitoring data. Physiological fluctuations can be modeled with NVRs based on ventricular and white matter ROI values [172]. Alternatively, the data can be decomposed using ICA (e.g., MELODIC [173] or GIFT [174]) and components relating to physiological processes identified with automated or semiautomated methods. These can be based on experimental thresholds [175], statistical testing [176], automatic thresholding [177], or supervised classifiers [178]. Once identified, these components can be removed from the data. While in their infancy, these methods are very promising, particularly for the ventral brain. Tohka et al., for instance, demonstrated marked Z-score increases in frontal ventral regions and other areas close to susceptibility artifacts.

3.3.3 Correction of Stimulus-Correlated Motion Artifacts

In patient group studies, Bullmore et al. [179] have shown the need to compare the extent to which SCM explains variance between the groups, and suggest that this be identified using an analysis of covariance (ANCOVA). Without this approach, differences between the groups arising from higher SCM in the schizophrenic group in their study would have been attributed to differential activation in response to the task.

In the example of Fig. 2 (left), realignment of the time series in the motion-only replication did not substantially reduce the amygdala SCM artifact (right), but including identified motion parameters in the model as NVRs was effective [168, 180]. Alternatively, a boxcar NVR corresponding to presentation and response periods can be included in the model [181]. This and a number of other studies [182] have shown that the temporal shift in response introduced by the hemodynamic response function (HRF) makes it possible to separate motion from activation for short presentation periods, making event-related designs less sensitive to motion than block designs.
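
The benefit of including motion parameters as NVRs can be illustrated with a noise-free toy model. Everything in this sketch is hypothetical: the boxcar design, the single motion trace that is partly correlated with the stimulus, the effect sizes, and the helper `ols`. It also omits convolution with the HRF for brevity.

```python
import math

def ols(columns, y):
    """Ordinary least squares via the normal equations, solved by
    Gauss-Jordan elimination with partial pivoting."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

n_vol = 80
const = [1.0] * n_vol
# Boxcar task regressor: 10 volumes off, 10 volumes on.
task = [1.0 if (n // 10) % 2 else 0.0 for n in range(n_vol)]
# Motion trace that partly follows the task (stimulus-correlated motion).
motion = [0.5 * t + 0.3 * math.sin(0.7 * n) for n, t in enumerate(task)]
# Simulated voxel: true task effect 1.0, motion artifact weight 3.0.
signal = [10.0 + 1.0 * t + 3.0 * m for t, m in zip(task, motion)]

beta_without = ols([const, task], signal)
beta_with = ols([const, task, motion], signal)
print(f"task estimate without motion NVR: {beta_without[1]:.2f}")
print(f"task estimate with motion NVR:    {beta_with[1]:.2f}")
```

In this toy case the model without the NVR attributes the motion artifact to the task, whereas the model with the NVR recovers the simulated task effect. With real data the separation is only as good as the measured motion parameters.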

4 Summary and Discussion

Emotional neuroimaging is a rapidly expanding area that provides an interface between neurobiological work and psychophysiological emotion research. One important view that has emerged from the area of behavioral neuroscience is that emotional processes play a central role in the adaptive modulation of perceptual encoding, learning and memory, attention, decision-making, and control of action [9]. Many neuroimaging studies have demonstrated that amygdala activation, for example, modulates attention and memory storage in other brain regions such as the hippocampus, striatum, and neocortex. Such interactions may occur as facilitations or modulations of neurocognitive function at several levels of processing. Conversely, recent work has shown that the organism is prevented from excessive emotional activation not only by low-level habituation or negative feedback mechanisms but also as a result of protective inhibition processes. Diverse behavioral and cognitive strategies have been identified that modulate and downregulate the ongoing emotion process [6]. The modulating effects on emotional arousal during an emotional episode such as rejection (venting and disengagement) or accommodation (relaxation, distraction, reconceptualization, rationalization, or reappraisal) deserve further inspection with respect to the involved neural mechanisms.

Although important advances have been made in the area of human emotion perception, learning, and autonomic conditioning, research has typically been limited to a small number of primary and mostly negative emotions such as fear, anger, or disgust. Limiting the range of investigated categories (neglecting shame, guilt, interest, etc.), dimensions (neglecting positive emotions such as care, support, etc.) and behavioral procedures does not do justice to the complexity of the multistage emotional appraisal process described above [29]. It is equally important but more difficult to identify the correlates of complex emotions such as those resulting from beliefs, preferences, predispositions, or interpersonal exchange. Not only the social dimensions such as untrustworthiness or dishonesty [183, 184], but also positive aspects such as social fairness [185], trust, and supportiveness play a role. Moreover, an understanding of modulating sociocultural influences is essential for a comprehensive conceptualization of human emotion [29].

Current neuroimaging research on emotion can be described as an ongoing construct validation process [186], which draws upon convergent and divergent associations of local activation variables and psychological constructs. The experimental measures (operationalizations of psychological constructs) are expected to be correlated with regional brain activations. It is evaluated whether topographically distinct patterns of activation in a certain region consistently predict engagement of different processes (for an example in the area of cognitive processing, see Ref. [187]). Indicators of a different construct are expected to correlate with activations of different areas. This corresponds to the well-known double dissociation strategy that inspects task by localization interactions in neuropsychology [64].

This validation process typically starts at a relatively broad construct level and proceeds downward in the hierarchy of constructs to finally specify within-systems constructs. Previous studies have demonstrated a relatively high cross-laboratory repeatability of emotional brain activation patterns at a higher systems level. At lower levels, however, the reliability of psychological or activation data may decline depending on limitations of the instruments.

Nonetheless, high-field fMRI scanners permit an improved discrimination of activations, for example, within the different subnuclei of the amygdala [5]. It is evident that increased discrimination on the neural side must be accompanied by a refined technology to assess more fine-grained emotional constructs on the behavioral side.

Neuropsychological construct validation requires additional physiological data to obtain some kind of convergent information about the indicator variable. At the neurophysiological level, the perfusion mechanisms have been elucidated by combining the greater spatial resolution of fMRI with the real-time resolution of intracortical local field potential (LFP) recordings. The neurophysiological coupling mechanisms of neural activity and the BOLD response can thus be assessed [188]. An application of both fMRI methods and electrophysiological approaches (e.g., surface and deep electrode recordings from limbic brain structures) is useful [189]. The combination of brain perfusion changes and electrophysiological correlates of oscillatory coupling will foster the understanding of the neural interaction processes within frontal and temporal networks [190].

On the level of the autonomic nervous system, multivariate coregistrations of psychophysiological response patterns, including emotion-modulated startle, heart rate variability, or cortisol secretion, facilitate the validation of experimentally induced emotions or of the presence of specific emotional disorders.

Emotional neuroimaging has continuously profited from improvement in scanning techniques and the adaptation and standardization of signal processing strategies. However, this area has not only benefited from the diverse contributions of its subdisciplines but also inherited their methodological problems. An inspection of brain imaging studies of emotion showed that measurement quality may be influenced by many factors: by a rapid and differential habituation of responses to emotional stimuli in some regions; by artifacts of certain signal scaling techniques that are applied by default; by situational or state-dependent influences; and by insufficient validation of the emotion to be elicited (manipulation check). Interindividual differences of emotional regulation behavior appear to modulate event-related reactions during the time course of the experiment.

Some of the many approaches to reducing signal loss in EPI in the anterior MTL have been outlined here, as well as some of the methods for identifying and correcting artifacts arising from SCM, distortion, and physiological fluctuations. Despite the gravity of the problem and the effectiveness of some of these strategies, the overwhelming majority of fMRI studies of the emotions use the same measurement protocols and analysis methods as have been applied to study cognitive function over the last decade.

Combining many of the simpler strategies described here—high field strength, small voxel volumes, partial k-space acquisition with the correction of physiological and SCM artifacts—allows reliable results to be achieved in the anterior MTL [191]. Figure 6 demonstrates such an example; the detection of subtle differences in amygdala activation between explicit and implicit emotion processing [192].

Fig. 6

High-resolution imaging detailed in this chapter allows the acquisition of low-artifact echo-planar imaging (EPI) and allows subtle processing effects to be distinguished. Group results from 29 subjects for the conditions (a) emotion recognition (b) implicit emotion processing (age discrimination) and (c) the difference between the two conditions (3.0 T). Results, showing activation in the amygdala and fusiform gyrus (as well as cerebellum and brainstem) are overlaid on mean EPI and thresholded at p = 0.05, family-wise error corrected. Reprinted from [192], with permission

Moreover, new research designs and analysis methods such as Structural Equation Modeling or Dynamic Causal Modeling are now available to inspect the effective or causal connectivity that, for example, permits the PFC to modulate amygdala activity [62]. The influences of individual brain regions on one another can also be studied by combining functional brain imaging with the lesion approach or transcranial magnetic stimulation [193].

We have raised a number of caveats that highlight some of the limitations of emotion assessment in a scanner environment. As has been argued above, a lack of representativeness must be noted, that is, emotion includes a much broader conceptual network than currently covered by neuroimaging research. Thus, generalizations to other areas of functioning remain difficult. Representative designs are needed that pay greater attention to high-level strategies that depend on sociocultural factors and initiate, modulate, or regulate emotions. Moreover, the representativeness of results is limited due to small, selected, and poorly described study groups. Finally, since emotion elicitation in the scanner has been highly artificial, the power to predict emotions outside the neuroimaging context remains questionable. An ecological functional brain imaging approach that includes natural behaviors and environmental contexts of presentation may help to obtain a more representative view of real-life emotions.

Subject-specific mechanisms regulating the strength and temporal pattern of response to emotional stimuli and the balance of excitatory and inhibitory processes are of particular interest. The variability of the BOLD response between trials and across the time course of the experiment needs to be explained. Future research may therefore examine individual and group differences with a view to resolving inconsistencies in the literature [5, 12, 77]. Investigations into personality disorders or psychiatric diseases will provide further insight into the dispositional factors modifying the response to situational stressors. Paradigms specifically adapted to the investigated disorder may help to identify prefrontal dysfunction and associated failure to tonically inhibit amygdala output or to recognize safety signals eventually inducing sympathetic overactivity [194]. It may be that—as is the case in motor tasks—a large proportion of the intertrial variation not only in the behavioral response [102], but also in the BOLD signal [101] is explained by fluctuations in underlying RSNs.

Eliciting emotions in the environment of an imaging scanner remains a highly artificial process. This raises the question as to the predictive value of current neuroimaging data for explaining the emotional modulations in real-life contexts. This is particularly important for applied areas such as psychotherapy and coping research. Thus, in addition to identifying the neurobiological basis of emotional regulation behavior, the generalizability or predictive validity of imaging data for real-life emotions should be systematically evaluated.

5 Conclusions

Neuroimaging has replicated and extended earlier findings of neuropsychological studies in brain damaged subjects. It has significantly contributed to unraveling the organization of neural systems subserving the different components of emotional stimulus-response mediation along the neuraxis in healthy human subjects. Improved operational definitions and paradigms have contributed to differentiating subcomponents of emotional functions such as, for example, perceptual decoding, anticipation, associative learning, awareness, and response mediation. However, despite obvious advances, a comprehensive model integrating the diverse emotional behaviors on the basis of involved cerebral mechanisms is still unavailable. Moreover, the interpretation of findings is complicated by technical and methodological difficulties.

Research advances depend not only upon the technical refinement of imaging methodology but also on the improvement of behavioral procedures and measurement models. Neuropsychological construct validation procedures imply that an increase in the localization precision of the imaging technology would also require enhanced precision on the side of behavioral operationalizations. However, this seems not to be the case, as many studies still use unsophisticated stimulus materials or global instructions involving multiple or undefined subfunctions. As long as relatively global operationalizations are applied, however, the obtained neuropsychological correlations (for example, regarding activations of the PFC) will remain incomprehensible.

We have suggested here the framework of a lens-type assessment model, wherein activations in well-characterized neural structures may be used as predictors of particular emotional processes. According to this model, a hierarchy of latent constructs constitutes the behavioral level, an idea that is largely accepted in psychology. On the level of brain activity, patterns or families of topographically distinct activity can be identified in a similar way and used as predictors of behavioral function. Following the assumptions of a methodological parallelism, neuropsychological construct validation procedures make use of this framework of activity–behavior associations on different levels of the hierarchy. It can be extrapolated from multivariate personality theory that the prediction of behavior will only be successful if activation measures and psychological data are analyzed on a similar level of generality or aggregation.
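The requirement of matched aggregation levels can be illustrated with a toy simulation. The sketch below is not taken from the studies reviewed here. It assumes a hypothetical two-level hierarchy in which each neural and behavioral indicator is the sum of a general factor, a facet-specific factor shared by the corresponding neural and behavioral indicator, and independent measurement noise. All names and parameter values (`simulate_symmetry`, `n_facets`, `noise_sd`, and so on) are illustrative assumptions.

```python
import random
from statistics import fmean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_symmetry(n_subjects=2000, n_facets=4, noise_sd=0.7, seed=0):
    """Correlate simulated neural and behavioral measures at matched
    and mismatched aggregation levels (hypothetical generative model)."""
    rng = random.Random(seed)
    neural, behavior = [], []
    for _ in range(n_subjects):
        general = rng.gauss(0, 1)  # broad construct shared by all facets
        specific = [rng.gauss(0, 1) for _ in range(n_facets)]  # facet-level constructs
        # Each indicator = general + facet-specific + independent noise.
        neural.append([general + s + rng.gauss(0, noise_sd) for s in specific])
        behavior.append([general + s + rng.gauss(0, noise_sd) for s in specific])

    neural_facet = [row[0] for row in neural]      # one narrow neural measure
    behavior_facet = [row[0] for row in behavior]  # the corresponding narrow behavior
    neural_agg = [fmean(row) for row in neural]    # broad neural composite
    behavior_agg = [fmean(row) for row in behavior]  # broad behavioral composite

    return {
        "facet_to_facet": pearson(neural_facet, behavior_facet),
        "aggregate_to_aggregate": pearson(neural_agg, behavior_agg),
        "facet_to_aggregate": pearson(neural_facet, behavior_agg),
    }


if __name__ == "__main__":
    for pairing, r in simulate_symmetry().items():
        print(f"{pairing}: r = {r:.2f}")
```

Under these assumptions, the two matched pairings (narrow with narrow, broad with broad) yield higher correlations than the mismatched pairing of a single narrow neural measure with a broad behavioral composite. The size of the gap depends entirely on the assumed variances and should not be read as an empirical estimate.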

In view of the complexities of emotional regulation behavior in human subjects, it is equally important to advance assessment theory, psychological conceptualization, and behavioral methodology [29]. Future work should therefore more closely inspect issues related to model construction, symmetry of neural and behavioral variables, and their aggregation levels. Multidisciplinary approaches that combine improvement in brain activation measurement with enhanced psychological data theory may thus foster construct validity, reliability, and predictive power of emotional neuroimaging.

Knowledge pertaining to the localization of brain activations and their functional connectivity is also an important input to inform and constrain cognitive theories of emotion psychology. Insights from the brain will thus help to explain the incoherences among physiological, behavioral, and subjective indicators of emotion that are so frequently observed in psychophysiological studies. Activation data may also help to establish models that possess a better “breakdown compatibility,” that is, the power to predict behavioral change as a consequence of brain damage.

The introduction of structural/connectional and functional data has considerably bolstered scientific construct validation processes in the affective neurosciences and emotion psychology. Topographically distinct activity patterns are increasingly identified that possess a certain incremental validity, that is, an increasing power to predict the individual dynamics of emotional regulation behavior. Establishing a representative and valid model of emotional functioning is a necessary precondition for many areas of application such as the categorization of patients with emotional disorders and the assessment of psychotherapy.

Greater attention to methodological issues may help to bring more rigor to experimentation in the field of emotional neuroimaging, promote interdisciplinary research, and facilitate cross-laboratory replication. A wealth of approaches for countering BS loss in the amygdala have been presented, many of which are available as standard on commercial scanners or simply require the adoption of suitable imaging parameters [117, 125, 134]. Also, in the absence of a measurement theory that describes validated procedures or instruments for assessing emotional constructs, single findings cannot be trusted. Although the absence of validation is acceptable in the early stages of the research cycle, current emotional neuroimaging work has only just begun to approach the confirmatory stage. To establish confidence in the suggested models, additional efforts are required to empirically validate assessment strategies and instrumentation.