1 The Rise of Music and Emotion Research

The question of how music induces emotions in listeners with such effortless grace is a puzzle worth solving not only for its myriad real-world applications, but also for its ability to address the fundamental reasons for the existence of music [29.1]. Emotions induced through music are intrinsic to most ceremonies [29.2] and may be used therapeutically [29.3], but research has not yet fully dissected the factors contributing to such effects and uses of music; in fact we are neither sure of the ways in which music engenders feelings, nor do we agree on the typology of emotions relevant for most musical episodes. Extensive summaries of music and emotion have been provided in recent dedicated books and journal articles [29.4, 29.5, 29.6, 29.7]. This chapter not only reviews the topics in which consensus has emerged, but also attempts to combine some of the key themes that have perhaps become lost amongst the increasing diversity of theories and foci of research.

There has been a long-standing interest in the emotional pull of music. From ancient Greece (Aristotle, Plato) to the Enlightenment (Rousseau, Baumgarten), philosophers have attempted to account for this pull, as have the fathers of evolutionary thought (Charles Darwin) and contemporary psychology (Wilhelm Wundt). While the empirical study of emotions has been proceeding now for more than 100 years, the rise of its popularity in music research coincided with the cognitive paradigm loosening its grip on the range of permitted topics in the early 1990s. From this point on, softer topics such as emotions were allowed to flourish after several decades of focus on human information processing. Music research always had an eye on the emotions despite their unpopularity in psychology, and exceptional individuals had presented their views on the subject. For instance, Kate Hevner discovered the low-dimensional structure of affects in the 1930s, fifty years before James Russell [29.8] proposed the well-known affective circumplex model. Later, Leonard B. Meyer connected emotions to expectations [29.9], Paul Farnsworth developed a new emotion vocabulary [29.10], and Daniel Berlyne redefined the topic as a mission to find the objective properties of the stimulus that would relate to arousal [29.11].

The 1960s and early 1970s saw research that rediscovered the conceptual cartography of the emotional landscape in music, with Rigg [29.12], Wedin [29.13], and Gabrielsson [29.14] asking listeners to provide free verbal reports and adjective choices for the emotions represented by music examples. This line of research aimed to formulate a definitive emotion taxonomy for music, which even now has not been fully resolved. In the 1980s and 1990s, the field focused more on methodological innovations such as continuous ratings of emotions [29.15], and on specific emotional responses such as thrills [29.16] and strong experiences [29.17].

In the field of affective sciences, the 1990s witnessed a frenzy of activity when influential neuroscientists such as Antonio Damasio [29.18] and Joseph LeDoux [29.19] outlined their seminal theories of emotions. The neuroscientist Jaak Panksepp [29.20] soon applied these ideas to music, quickly followed by others [29.21]. The culmination of research up until the turn of the millennium was collated in the first handbook of the field [29.22], which solidified research terms and concepts. However, in the period that has followed, neither the field of affective sciences, nor the smaller subfield of music and emotion studies, have provided full answers to all fundamental issues.

This chapter addresses questions of what and how concerning emotions and music. The first section inquires into what the perceived and experienced emotions are; here the notions of core affects, basic emotions, and complex emotions are discussed. This is followed by a section exploring how, which offers a summary of the mechanisms considered relevant for these emotions. Mechanisms are covered in parallel to emotion structures, and the two complementary aspects of emotions – perception and experience – are discussed within a uniform interpretative framework. In the final section, the current challenges are reviewed and discussed.

2 Structure of Emotions

To summarize the current state of research in music and emotions, a few definitions are first in order. The terminology used in the field is diverse, but the concentrated effort that created a dedicated handbook [29.4] has remedied the situation considerably. Affect is the broader domain encompassing emotions, moods, and feelings. Emotions, which are typically the focus of attention in this field, are distinguished from moods by their relatively short duration and moderate to high intensity, whereas moods tend to be longer and less intense than emotions. However, it has been suggested that this distinction is rather blurry and exceptions have been pointed out [29.23]. Feelings refer to the subjective component of the emotion whereas arousal is the physical activation of the autonomic nervous system. Under these categories, there are specialized emotional and physical reactions, such as chills or goosebumps, and, in addition, strong emotions, which can encompass a range of musical experiences [29.5]. Emotions have the capacity to span a broad range of topics such as motivation, preference, intensity, and affect reactivity [29.24], but these are rarely addressed in the context of music and will not be covered in this chapter.

Although consensus on the topic of emotions has been rather elusive [29.25], the main subcomponents of emotions are largely agreed on. These include (1) appraisal (one assesses a situation to be dangerous), (2) expression (one screams), (3) autonomic reaction (one starts to perspire), (4) action tendency (one moves away from the situation), and (5) feeling (one feels threatened), all of which occur more or less simultaneously ([29.26, pp. 6–8], [29.27]). Purely cognitive processing is not usually considered an emotion since all theoretical positions assume some form of arousal to distinguish emotions from processing of information [29.28]. The components occur at conceptually different levels and indeed, the outcomes of these processes are often described at different levels: physiology (changes in brain, hormonal or autonomic nervous system states), psychology (functions, appraisal, or recognition processes), or phenomenology (emotions as experienced). In this chapter, the levels are distinguished in terms of the processes related to sensation, recognition and experience of emotions and this division shapes the models of emotions and mechanisms proposed at each level.

The aim here is to bridge the gap between what are now largely different avenues of research: the emotions perceived in music (also known as emotion recognition or expressed emotions) and the emotions experienced or induced in the listener. The former was an early focus of the field [29.13, 29.29, 29.30]. In such studies, listeners were often instructed to describe the music in emotional terms (e. g., this music is sad) or describe what the music might have been expressing (e. g., this music expresses sadness). During the last decade, the emphasis has shifted towards explorations of how music makes listeners feel. The distinction between recognized and induced emotions has been tempered by solid evidence of the way in which the two often overlap. Empirical evidence suggests that the distinction should be characterized as a question of intensity [29.31] rather than as involving completely different processes, despite work considering the latter [29.32]. Although the perception and experience of emotions are intrinsically linked, increasingly refined theorizing and the proliferation of practical examples have taken the two categories in different directions. Here, instead, the intention is to clarify the similarities between emotion recognition and induction processes in music by bringing the two into a framework that contains different levels of affects as well as a series of mechanisms capable of producing them. This framework also attempts to locate the mechanisms within these different levels of affects.

The theoretical review of emotion models is organized along three explanatory levels of affects, starting from low-level core affects, proceeding to basic emotions, and ending with high-level, complex emotions. The first two levels refer mainly to processes involved in the perception of emotions whereas the last one refers to the induction of emotions, although this division risks oversimplifying matters. To help the reader interpret and keep track of the affect levels and how they are connected to the mechanisms discussed in the sections that follow, a schematic illustration of the key concepts is given in Fig. 29.1.

Fig. 29.1 Schematic illustration of affect concepts (levels, models, and mechanisms)

The left part of the illustration concerns the question of what the pertinent emotions are. Affect levels refer to distinctions between low-level sensed emotions (core affects), perceived emotions, and experienced emotions. This division corresponds approximately to the models of emotions conceived to date. The organization of affect levels as having low-level measurable properties capable of producing highly different conceptual interpretations is influenced by the hybrid model of emotions proposed by Lisa Feldman Barrett [29.33, 29.34]. In this model, the underlying physical machinery is best described by the dimensions (core affects), but the conscious interpretation of these is categorical and influenced by the conceptual categories people have for emotions. The synthesis proposed here (Fig. 29.1) assigns different mechanisms of emotions to different affect levels. This dynamic model claims that the way people use conceptual knowledge determines how they feel, and, due to variance in context, differences between individuals, and the effects of high-level mechanisms, there is increasing variation in emotions at the highest, complex and experiential, level.

In the affective sciences in general, emotions have been theorized as organized along a few core dimensions (i. e., core affects), belonging to discrete categories (basic emotions), or having a complex, perhaps even a domain-specific structure. Each of these major descriptive schemes is applicable to music and will be explained in the three sections that follow.

2.1 Emotion Dimensions and Core Affects

A popular way to conceptualize emotions is to place them on a continuum between positive and negative. This dimensional approach, rooted in the work of Wilhelm Wundt [29.35], is best known as the circumplex model of emotions [29.8]. This bidimensional model characterizes emotions as mixtures of two core dimensions, valence and arousal (abbreviated as A and V in Fig. 29.1), which represent two orthogonally situated continua of pleasure-displeasure and activation-deactivation in the affective space. The circumplex model has received support in large studies of self-reported emotions [29.36], cross-cultural comparisons [29.37], and psychometric studies (reviewed in [29.38]) and has thus been used in a large number of emotion studies in psychology as well as in systematic musicology. Russell and Feldman Barrett [29.34] have characterized the dimensions as core affects, to differentiate theirs from other dimensional models. The term core affect refers to the idea that such affects arise from the core of the body and in neural representations of body states. Core affects are assumed to be present already in infants [29.39] and to have psychological universality [29.40, 29.41].
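The geometry of the circumplex can be made concrete with a small sketch: emotion terms become points in a two-dimensional valence-arousal space, and any measured (valence, arousal) pair can be located relative to them. The coordinates below are hypothetical placements chosen for illustration, not empirical norms from any published study.

```python
import math

# Hypothetical valence-arousal coordinates (both in [-1, 1]) for a few
# emotion terms; illustrative placements only, not empirical norms.
CIRCUMPLEX = {
    "excited": (0.7, 0.8),
    "happy":   (0.8, 0.4),
    "calm":    (0.6, -0.6),
    "sad":     (-0.6, -0.5),
    "afraid":  (-0.6, 0.7),
}

def circumplex_angle(valence: float, arousal: float) -> float:
    """Angular position on the circumplex in degrees
    (0 = maximally pleasant, 90 = maximally activated)."""
    return math.degrees(math.atan2(arousal, valence)) % 360.0

def nearest_term(valence: float, arousal: float) -> str:
    """Closest labeled emotion term in the affective space
    (Euclidean distance)."""
    return min(CIRCUMPLEX, key=lambda t: math.dist((valence, arousal), CIRCUMPLEX[t]))
```

For instance, a rating of moderately positive valence and low arousal, `nearest_term(0.5, -0.5)`, lands closest to the "calm" region of this toy space.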

Another way of expressing the dimensions of emotions has been to divide them into approach- versus avoidance-related emotions [29.42]. There have also been influential variants of the circumplex model, such as the rotated circumplex model by Watson et al. [29.43] (the Positive and Negative Affect Schedule, or PANAS). Thayer [29.44] reorganized the arousal dimension into energetic arousal and tense arousal on the basis that separate psychobiological systems are responsible for energy and tension. Both formulations have been successfully applied to research on emotions in music [29.45, 29.46], but they remain, nevertheless, fairly unpopular in music research.

The main problem for any dimensional model is that emotions can sometimes be ambivalent [29.47, 29.48]. It is possible to feel both happy and sad at the same time, and indeed, empirical studies using tragicomic films [29.49] or music [29.31] seem to suggest that mixtures within the dimensions are relevant for emotional experiences. This problem is mitigated by postulating that such mixed emotions are co-occurring combinations of different activities within a dimension, or even rapid switching between the two [29.50].

2.2 Basic Emotions and Emotion Perception

Perhaps the most widely known way to organize emotions is to assume they are discrete categories, also referred to as basic, primary, or even fundamental. Such theories posit that all emotions can be derived from a limited set of innate and thus universal basic emotions, such as fear, anger, disgust, sadness, and surprise [29.51]. The actual number of categories and the labels for the categories are still a source of debate, but the basic emotion model has gathered support from neural, cross-cultural, developmental, and physiological research spanning four decades [29.27]. Despite the popular appeal of such categories, doubts about their explanatory power have been raised, as, for instance, brain imaging studies have not yet delivered results consistent with innate and distinct emotion categories [29.52]. In music, almost half of the studies focusing on experienced emotions have resorted to basic emotions [29.53]. This popularity is related to the fact that the categories are easy to use in emotion recognition studies, which are popular in developmental [29.54, 29.55] and production research [29.56, 29.57]. Similarly, when physiological or neural correlates of music-induced emotions are explored, basic emotions tend to be chosen [29.58, 29.59, 29.60, 29.61, 29.62], although this has started to change due to the availability of emotion taxonomies made specifically for music-aroused experiences (described in the next section).

The dimensional and basic emotion models offer distinct ways to tackle musical emotions. However, when the two models are mapped onto each other in emotion expressed by music, the results have suggested that the models overlap considerably [29.63].

2.3 Complex Emotions and Emotion Experience

When people engage with artworks, or objects in nature, the emotional experiences are not easily explained as dimensions or discrete patterns of survival emotions [29.64]. Fiction, including music, does not have the direct material effects on the physical or psychological wellbeing of the individual that events in everyday contexts do. This freedom of fiction to explore the expanses of the human mind, rather than provide handy heuristics for reacting to the environment in everyday activities, may expand the scope of possible emotions in fiction. It is perfectly plausible to think that emotions induced by music – or any art in general – are more contemplative, reflective, and nuanced, or if we need to use one word – complex. A similar argument has been put forward for other complex emotions such as moral, social, or epistemic emotions that are not necessarily involved in everyday survival [29.27]. Such complex emotions are also more subject to cultural and social interpretations than basic emotions [29.41, 29.65].

In music, the desire to explain complex emotions has often been the topic of musicological writings [29.66, 29.67], but the empirical mapping of the topic began in the late 1960s, when Kaarle Nordenstreng [29.68] analyzed the structures underlying listener experiences across musical excerpts. A decade later Edward Asmus [29.69] carried out a large-scale study (n = 2057) of the affect terms relevant for music and proposed nine dimensions of affects accounting for the variety of emotion states. The decisive attempt to account for the emotional experiences induced by music was made by Marcel Zentner and his colleagues [29.70]. They started with a comprehensive list of emotion terms relevant to music, validated these with iterations of surveys, and finally administered the refined lists to a festival audience. Next, analysis of the similarity between the terms reduced the list to ten solid factors, which were further validated with separate ratings and confirmatory factor analysis. This final stage resulted in nine emotion factors, a model now known as the Geneva Emotional Music Scale (GEMS). The model emphasizes positive emotions (at least seven of the nine factors) and provides factors particularly appropriate for contemplative emotions such as wonder, nostalgia, and transcendence. For this reason, it fits well with the tradition of complex and aesthetic emotions, and it has been widely adopted in studies of emotional experiences related to music [29.71, 29.72]. It is worth noting that other proposals for the emotions induced by music have been offered as well, based on equally large samples of participants [29.73]. This proposal shares many of the same factors, such as nostalgia and being moved; the remaining differences may be merely semantic, along with the final number of broad categories included.

To summarize, in this section I have discussed the previous models of music-related emotions and presented a new synthesis, outlined in Fig. 29.1. In this scheme, the low-level dimensional representations of core affects can be collapsed into basic emotion categories in the perception of emotions using specific, well-known labels (e. g., happy, sad, etc.). However, at the level of experienced emotions, the conceptual act of labeling emotions is fundamentally modified by high-level mechanisms and modifiers as well as by language and culture. Therefore, the experiences, as well as the labels applicable, differ between emotion perception and experience. This fundamental difference between emotion processes and actual experience is of crucial importance here [29.33]. This perspective may explain some of the dissatisfaction scholars have had with the models designed to explain emotions as processes and objects of recognition (core affects and basic emotions), and why others have declined to acknowledge the existence of music-specific emotions. The champion of complex emotions is the GEMS model, but the differences between complex and basic emotions become less of an issue if the differences in the underlying processes are fully acknowledged.

3 Mechanisms and Modifiers of Emotions

The reasons and mechanisms of how a particular piece of music might express emotion or evoke particular emotions in listeners are not yet completely understood, although a coherent theoretical framework for experienced emotion has been proposed by Patrik Juslin and Daniel Västfjäll [29.74]. The individual mechanisms in this framework (dubbed BRECVEMA after its eight mechanisms in a recent update [29.75]) will, in this synthesis, be divided into four fundamental processes (physiological, embodied, memory, and appraisal), which are assigned to the different levels of affects described earlier. The purpose is to better acknowledge the interlaced nature of sensed, perceived, and experienced emotions, and thus the emotion mechanisms follow the logic of the previous section and the diagram outlined in Fig. 29.1.

This reorganization also implies that the boundary between perceived and experienced emotions with respect to emotion mechanisms is redefined, since the BRECVEMA framework only applies to experienced emotions. However, several of the low-level mechanisms (e. g., contagion, entrainment) are appropriate for emotion perception as well, and by bringing these two closely related processes into the same scheme, the mechanisms previously applied only to induction may help in understanding the principles of emotion perception. This decision is also motivated by the current lack of suitable mechanisms to explain the way emotions are mapped across domains (e. g., visual, auditory). In such processes, the concept of contagion is actually dependent on recognition of the actions giving rise to emotion experiences.

Again, we start with the low-level mechanisms for emotion perception before moving on to higher-level mechanisms of emotional experiences (the right side of Fig. 29.1). The issues that are known to modify the emotion processes in music are also incorporated into this summary of mechanisms.

3.1 Mapping Mechanisms of Emotions

In this synthesis of the mechanisms, the BRECVEMA framework can be interpreted as containing low-level mapping mechanisms, in which a direct transfer – mapping – between sound and affect takes place. Two categories of mapping mechanisms can be distinguished. The first of these, labeled orientation mechanisms, consists of the brain stem reflex, expectancy, and entrainment. The brain stem reflex is a hardwired attention response that is activated by any exceptional – loud, sudden, sharp, accelerating – sound, which not only directly influences core affects (e. g., an increase in physiological arousal) but is also recognized as surprise and may be experienced as unpleasant or exciting [29.76]. Another orientation mechanism is expectancy, where any violation (pitch, tonal, rhythmic) of the expected musical structure creates an orientation response at the core affect level, which may be interpreted in different ways; violation of expectations has been observed to lead to anxiety [29.76], surprise [29.77], or even to thrills [29.78]. Entrainment refers to the adjustment – though not necessarily synchronization – of internal oscillators to rhythmic periods in the music. Entrainment itself may be conceived as a multifaceted phenomenon since it can refer to perceptual, motor, physiological, or even social entrainment [29.79], each of which may have different implications for emotions. For instance, motor entrainment has been shown to modulate pleasantness [29.80], whereas social entrainment has been judged to induce feelings of being connected [29.81]. The entrainment mechanism also facilitates orienting and attending to information in music. For this reason, entrainment acts as an attentional focus that may be capable of enhancing the delivery of other mechanisms and emotions in music.
In summary, these three orientation mechanisms (brain stem reflex, expectancy, and entrainment) act as strong guides for perceptual processes and attention, are considered to be low-level processes, and often lead to changes in core affects (arousal in particular).

A second set of relatively low-level mechanisms, including contagion and visual imagery from the BRECVEMA framework, are defined collectively in the synthesis presented here (Fig. 29.1) as embodied mechanisms. These are embodied in the sense that they refer to the reactivation of past sensory and motor mechanisms [29.82, 29.83, 29.84]. This perspective holds that the body plays a major role in all interactions with the environment, and that any response to stimuli will be based on simulations, or reenactments, of others' nonverbal expressions and affective states [29.85]. Such mappings arise within expressive channels, such as matching another person's facial expressions or vocal intonations [29.86], but also across channels (between audio and visual [29.87]).

What makes contagion as a mechanism particularly relevant for embodied explanations is that it is assumed to consist of a process in which the listener perceives the emotional expression of the music, and then mimics this expression internally [29.74, p. 565]. Taken further, this must mean that most basic emotions are linked to specific bodily reactions and motor patterns that have been influenced by the production or causal results of experiencing that emotion. Because of built-in codes based on states reflecting the emotional experiences or their outputs (movements, expressions, sounds, and gestures of such states), an internal mapping between an external stimulus and our representation of the possible cue combinations is possible in contagion. If we take the example of fear, the peripheral nervous system in fear is associated with increased levels of glucocorticoids and norepinephrine [29.88] that lead to states of alarm that quickly raise autonomic arousal, which in turn leads to louder vocal output, higher pitch, brighter timbre, and faster movements than in a neutral state. These sets of cues, which are very similar in music and speech [29.89], are grounded in physical changes caused by underlying emotion states, in this example by fear and an ensuing state of stress, which affects the individual's vocal expression, posture, facial expression, as well as movement. Knowledge of this multimodal code allows us to decipher the intended emotional expressions, and this knowledge is assumed to be implicit and accessible by embodied simulation (what would I feel like if I sounded like that …). Even when merely activating some of the motor programs used in simulations, we may catch the same emotions, or at least, our emotional reactions may be amplified [29.82].

Another mapping mechanism, visual imagery, may also be associated with embodied processes, since cross-modal correspondences are thought to occur in music [29.90] and in other domains [29.91, 29.92]. Visual imagery is believed to generate emotions through such correspondences. Whilst the exact mapping in this mechanism has not yet been pinned down, it is assumed to be related to metaphoric representations [29.93] that are tightly coupled with physical dimensions [29.94].

3.2 Evaluative Mechanisms of Emotions

In addition to the low-level automatic mechanisms of emotions, a set of high-level, largely conscious and intentional mechanisms are included in the BRECVEMA framework. In the synthesis, these are cataloged into memory and appraisal mechanisms. The BRECVEMA framework includes two mechanisms related to memory: episodic memory and evaluative conditioning. In the latter, a repeated pairing of a particular music or sound example with either a positive or negative stimulus or outcome leads to a conditioned emotional response. This can be either automatic and unconscious or conscious. In music, leitmotifs associated with particular heroes and villains may be the best example of this mechanism, and empirical work on conditioned responses to sounds has demonstrated the malleability of such responses [29.95]. Episodic memory as a mechanism for emotions refers to a recollection of a specific event prompted by the music. This might lead to an emotional experience that is strikingly different from the emotion expressed by the music. For instance, in a recent study, a sad musical expression was turned into a happy emotional experience by inserting a short quotation from the Star Wars theme into a sad solo cello piece [29.76]. This mechanism, known also as the Darling, they are playing our tune phenomenon [29.96], is a potent mechanism for emotions, as autobiographical memories have been identified as the most frequently listed motive for listening to music [29.73] and are, of course, the most essential mechanism for nostalgia [29.97].

Another set of mechanisms, referred to here collectively as appraisal mechanisms, consists of three distinct categories: cognitive appraisal, identity confirmation, and aesthetic judgment. Cognitive appraisal is a process in which an emotion is caused by the evaluation of an event regarded as having significant implications for the goals of the individual [29.98]. Though this mechanism is not explicitly included in the BRECVEMA framework, it functions as a bridge between the appraisal theories of emotion [29.99, 29.100] and their successful use in explaining responses to the arts [29.101]. Moreover, the appraisal processes, such as the novelty check, goal relevance, goal congruence, and coping potential, can be considered powerful moderators of emotions [29.102]. Mechanisms labeled as identity confirmation, although not put forward as a category in the BRECVEMA framework, refer to the potentially powerful social effects of music as a carrier of self-identity [29.103, pp. 73–74] (see also [29.104]). Juslin et al. [29.73, p. 190] found this mechanism to be the second most important explanation of emotional episodes associated with music, yet it currently remains unexplored. The third appraisal mechanism is aesthetic judgment, which is simply an evaluation of the aesthetic value of music. Such an evaluation may initially comprise an aesthetic attitude [29.105], but may also involve multiple criteria, including beauty, skill, novelty, and artistic intention [29.75]. This recent extension to the BRECVEMA framework is considerably broader than the other mechanisms.

To recapitulate the sections on affect models and mechanisms, let us go back to the theoretical synthesis (Fig. 29.1) once more. Here I will take one of the most ubiquitous hits of music and emotion studies as an example, the Adagio in G minor for Strings and Orchestra by Tomaso Albinoni, which has been used in at least 16 published studies, despite the probable and dubious associations such a frequently heard piece might carry. In terms of core affects, it is fairly clear that the slow tempo, soft timbres, low dynamics, legato articulation, and gradually descending melodic lines mimic a low-arousal state, and perhaps even negative valence due to multiple cues consistent with sadness [29.106]. In this case, simple mapping mechanisms such as contagion, entrainment, and expectations support an interpretation of the music as calm and low-arousing, due to musical qualities such as predictability and ease of synchronization. Core affects could be measured through psychophysiology, self-reports consisting of dimensions, or by asking the listeners to rate what the music expresses in terms of basic emotion categories. This would most likely lead to the assessment that the piece expresses sadness [29.107], and perhaps tenderness. No surprises here, although such ratings of perceived emotions would be subject to minor modifications based on current mood, situation, personality traits, and music preferences.
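The cue-based reading of the Adagio can be caricatured as a toy mapping from surface cues to coarse core-affect estimates. Everything here is hypothetical: the weights, the neutral reference points (100 bpm, 70 dB), and the crude mode rule are illustrative choices, not a published cue model.

```python
# Toy cue-to-core-affect mapping in the spirit of the Adagio example.
# All weights, reference values, and the mode rule are hypothetical,
# chosen only to illustrate the direction of the cue-affect mapping.

def core_affect_from_cues(tempo_bpm, dynamics_db, articulation, mode):
    """Return coarse (arousal, valence) estimates in [-1, 1] from surface cues.

    articulation: 0.0 = fully legato ... 1.0 = fully staccato.
    mode: "major" or "minor".
    """
    # Faster, louder, more detached playing maps to higher arousal.
    arousal = (
        0.5 * (tempo_bpm - 100) / 80     # ~100 bpm taken as a neutral tempo
        + 0.3 * (dynamics_db - 70) / 20  # ~70 dB taken as a neutral level
        + 0.2 * (2 * articulation - 1)   # legato vs staccato
    )
    # Mode sets the valence baseline; arousal nudges it slightly.
    valence = (0.5 if mode == "major" else -0.5) + 0.2 * arousal
    clamp = lambda x: max(-1.0, min(1.0, x))
    return clamp(arousal), clamp(valence)

# An Adagio-like cue profile: very slow, soft, legato, minor mode.
adagio_arousal, adagio_valence = core_affect_from_cues(40, 55, 0.1, "minor")
```

Fed the Adagio-like profile, the sketch lands in the low-arousal, negative-valence quadrant, matching the perceived-emotion judgments (sadness, tenderness) described above; the point is only that such a mapping is monotone in the cues, not that these numbers mean anything.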

However, it is informative to consider what happens when attention is shifted to the emotions experienced when listening to the Adagio. In this case, appraisal and memory mechanisms kick into play, which may lead experienced emotions to differ markedly from perceived emotions. For example, somebody hearing the Adagio for the first time in a particularly receptive situation (say, in a cathedral when attending a memorial service) might experience the emotion of being moved, whereas another listener highly familiar with the piece might experience nostalgia due to fond recollections of past performances of the piece. For others, the appraisal mechanism related to aesthetic judgment might steer the experience towards feelings of wonder due to the sheer beauty of the piece and aesthetic appreciation of the classical music canon. These trajectories of how the emotional experiences and their labels may systematically – but not arbitrarily – change across the levels of affects are illustrated in the schematic outline (Fig. 29.1). It is worth pointing out that many of the shifts between perceived and experienced emotions caused by the memory and appraisal mechanisms lead to more positive experiences. This positive pull is likely to be related to the fact that music does not have an obvious material effect on the listener's wellbeing; it is a voluntary activity that offers a medium in which listeners can safely project their emotions. Taken further, it also implies that the experience of intense emotion of any kind may be inherently pleasurable, as long as the emotions do not come with any real-life consequences attached. The best example of the positive pull is sad music, easily recognized as such by listeners [29.108], but usually experienced as positive, perhaps involving feelings of peace, tenderness, nostalgia, or being moved [29.53, 29.72].

The purpose of the concurrent presentation (Fig. 29.1) of affect levels and mechanisms in the present synthesis is to emphasize that the mechanisms, together with the modifiers, may lead to entirely different experiences, and that core affects, perception of basic emotions, and experienced emotions represent different yet connected explanatory levels of these processes. While the mechanisms themselves undoubtedly have a major influence on shaping the emotions, there are other factors that are known to influence emotions in particular ways, here termed contextual modifiers, which will be described in the next section.

3.3 Contextual Modifiers of Emotions

Experienced emotions are more susceptible than perceived emotions to influences relating to context, whether the context concerns the music, the listener, or the situation. This idea has been formally expressed by Scherer and Zentner [29.109] as a multiplicative function between the structure of the music, the performance, the listener, and the context. The multiplicative nature of this scheme has not been directly tested, which suggests that the entire topic of modifiers has been considered of secondary importance in music and emotion research. Nevertheless, some contextual modifiers have been explored.

Starting with the music itself: music and emotion studies have mainly been conducted in Western art music contexts, utilizing highly educated Western listeners in particularly restricted situations (mainly laboratory settings, see [29.53]). The context created by the music itself – its genre, lyrics, and cultural connotations – is perhaps the most obvious modifier of emotions. Musical devices expressing emotions vary across musical genres, periods and cultures. Despite the apparent differences in musical materials, the role played by culture in musical emotion has been considered only a modifier of emotions, since it has been shown repeatedly that basic emotions in music can be recognized across cultures [29.108, 29.110]. However, complex emotions and experiences appear to be more dependent on cultural knowledge, since they rely on aesthetic judgments, memories, and identity formation, which all require learning and exposure to the music. This raises a more fundamental issue, which is that different genres of music have widely different functional uses. In the present synthesis, an attempt has been made to capture this issue through the notion that situations may alter the perception and experience of emotions, and that this may be more fundamental than a mere contextual modifier. Not all emotion concepts are relevant for all types of music, and for this reason even some of the core affect dimensions (e.g., valence) have been shown to be problematic when applied across different music genres [29.111]. Moreover, when large stimulus sets provided by social tagging of music are harnessed for computational analysis of emotional expression, it has been found that contextual information such as genre brings significant improvements to the prediction of emotions [29.112]. Contextual information is also known to affect the perception and experience of emotions.
For instance, extra-musical information intensifies [29.113], lyrics both enhance and subdue [29.114], and images amplify [29.60] the emotional experiences aroused by music.
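The finding that genre information improves emotion prediction [29.112] can be illustrated with a toy comparison. The following sketch uses entirely invented valence ratings and genre labels; it simply contrasts a context-free baseline (always predict the grand mean rating) with a genre-conditional predictor, the kind of improvement the computational studies report.

```python
# Toy illustration (hypothetical data): predicting rated valence either from a
# global baseline or conditioned on genre, mirroring the finding that genre
# information improves emotion prediction.
from collections import defaultdict

# (genre, rated valence on a -1..1 scale) -- invented ratings for illustration
tracks = [
    ("metal", -0.6), ("metal", -0.4), ("metal", -0.5),
    ("pop",    0.5), ("pop",    0.6), ("pop",    0.4),
    ("jazz",   0.1), ("jazz",   0.2), ("jazz",   0.0),
]

def mae(pairs, predict):
    """Mean absolute error of a predictor over (genre, valence) pairs."""
    return sum(abs(v - predict(g)) for g, v in pairs) / len(pairs)

# Baseline: ignore context, always predict the grand mean valence.
grand_mean = sum(v for _, v in tracks) / len(tracks)

# Context-aware: predict the mean valence of the track's genre.
by_genre = defaultdict(list)
for g, v in tracks:
    by_genre[g].append(v)
genre_mean = {g: sum(vs) / len(vs) for g, vs in by_genre.items()}

err_baseline = mae(tracks, lambda g: grand_mean)
err_genre = mae(tracks, lambda g: genre_mean[g])
print(f"MAE without genre: {err_baseline:.2f}")
print(f"MAE with genre:    {err_genre:.2f}")  # smaller: genre carries affect information
```

The point is conceptual rather than methodological: whenever emotional expression covaries with genre, even this crude conditioning reduces prediction error, which is why genre acts as a contextual modifier in computational models.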

Individual traits, moods, expertise and preferences of the listeners also modify emotions. At the broadest level, culture does contribute to the emotions, since traditions, customs, and musical genres are products of cultures and subcultures. As most music and emotion research has been conducted on Western listeners, there is a need to expand the sphere of investigation outside this realm (e.g., [29.115]). Of the other modifiers of emotions, musical expertise has been suggested either to amplify [29.116] or to have little effect on both perceived [29.117] and experienced emotions [29.70]. Personality traits have been shown to have a small but consistent moderating role for both the experience of emotions [29.118] and the perception of emotions [29.119]. Music preferences and motives for listening to music also have a small but systematic impact on emotions [29.116, 29.73].

A number of situational factors seem to affect the perception of emotions in music, but many of these are particularly important for experiencing emotions. Everyday music listening studies [29.120, 29.121] have shown that variations across listening contexts – whether at home, in a laboratory, on public transport, or with friends – have an effect on which emotions are likely to be experienced. These differences may not relate only to the situations themselves, but also to differences in the emotion regulation goals afforded by the situation. For instance, a gym affords an increase in arousal whereas a church serves an entirely different set of functions, and not only because these places differ acoustically. Finally, the most complex situational modifiers arise from the social dynamics of the situation; these include whether a listener is alone or in a group [29.122] and the opinions held by others in the group, both of which are known to affect emotions aroused by music [29.123].

4 Measures and Musical Materials

4.1 Self-Report Measures of Emotions

Most of the research on music and emotions relies on self-report measures such as Likert ratings and forced-choice designs meant to capture the emotions representing the affect models (dimensions, basic, or complex). There are numerous standardized instruments for collecting mood and emotion evaluations (such as the Positive and Negative Affect Schedule (PANAS), the Profile of Mood States (POMS), the Self-Assessment Manikin (SAM), and the Differential Emotions Scale (DES)), which have been reviewed in detail elsewhere [29.124]. All self-report measures are subject to caveats and limitations. These often relate to the confusion between perceived and experienced emotions, which can be difficult for participants to dissociate, or more generally to the demand characteristics of such methods, in which participants are biased towards complying with the inferred outcome. The standardized measures are efficient and convenient, but may either assume too much about the underlying experience or rely on specific semantic labels. Therefore it is also useful to probe experiences with open responses [29.125], interviews [29.103], and nonverbal measures such as similarity ratings [29.117]. Another way to qualify the emotions experienced is to collect peripheral and indirect measures of emotions.

4.2 Peripheral and Indirect Measures of Emotions

Peripheral measures of emotions, such as skin conductance response (SCR), heart rate variability (HRV), facial electromyography (EMG), respiration, and temperature, have become increasingly common in studies involving experienced emotions [29.126, 29.127, 29.76]. These indicators are well established in terms of the underlying physiology and emotional correlates (reviewed by [29.128]). In research focusing on strong responses to music, it has become customary to record chill reactions [29.129, 29.130]. Such physiological measures are not always sensitive enough for the emotional experiences examined [29.131, 29.59], although they do track arousal adequately [29.127]. In addition, there are other behavioral ways of discovering whether an emotional experience is actually taking place, including indirect measures and reaction times, both of which rely on the fact that emotional experience biases cognitive judgment in a systematic fashion. For instance, a sad individual interprets ambiguous faces as more negative and processes incongruent (e.g., happy) information more slowly than a non-sad individual. Therefore, indirect measures can be used to assess whether listeners are experiencing a negative or positive emotion [29.113], despite the considerable limitations such sensitive measures have. The three levels of affect outlined earlier (Fig. 29.1) are not all accessible with peripheral measures of emotions. Core affects may be measured with physiology, and, with some caveats, the basic emotions might serve as good targets of physiological measures if the question is really about emotional experiences rather than emotion perception. Measuring complex emotions with physiology may not be feasible unless one is interested in a very specific type of emotional reaction, such as chills or extremely strong aesthetic reactions.

4.3 Neural and Endocrine Measures of Emotions

Measuring brain activation has become increasingly common in emotion studies, although the purpose is often not to verify the emotions experienced or perceived, but rather to pinpoint when and where processing takes place. Here, electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI) are the key techniques, with EEG and MEG indicating electrical activity, and fMRI indicating blood flow and blood-oxygen level, both of which index the underlying brain activity. fMRI is accurate in terms of the location of the neural activity but imprecise in timing. Accordingly, fMRI studies have revealed which areas are involved in emotion experiences [29.130, 29.132] and perceived emotions, in terms of stimulus valence [29.133, 29.134, 29.135]. However, discovering where emotional information is processed is not the main aim of such studies, as the real insights come from functional explanations of what each brain area is responsible for. Such explanations allow us to associate the areas in question with plausible mechanisms. For example, a part of the limbic system called the amygdala, which itself has further specialized areas, is known to be relevant in detecting danger and the fast processing of visual information, as well as being able to code the core affect of arousal [29.136]. In neuroscience studies of music, the amygdala has been shown to be heavily involved in the processing of strong emotions, such as chills, but also in strong negative stimuli, demonstrating that it captures the intensity of the stimuli [29.137]. In addition, the anterior cingulate cortex (ACC), an area involved in autonomic activity, is often implicated in music and neuroscience research. This could be taken as another index of core affects, but such activations may take place due to the involvement of this area in movement and motivation.

Other brain areas, such as the ventral striatum and anterior insula, have been associated with pleasure induced by music [29.138], which in more general terms has been connected to reward-related brain areas that regulate the release of dopamine in the brain (demonstrated by [29.139]). In research connecting brain areas to emotion mechanisms, the amygdala and ventral striatum have been associated with low-level mapping mechanisms as well as with the outcomes of the emotions via higher-level mechanisms. In addition, there are more specific areas related to memory mechanisms, such as the hippocampus. The anterior hippocampal formation has been observed to display activity changes in numerous music studies [29.132, 29.140], and this area has a crucial role in learning and memory, as well as in expectation and the detection of novelty [29.141]. However, this is not to say that only musically induced emotions generated via memory mechanisms (episodic memory, evaluative conditioning) necessarily show hippocampus activation, as the hippocampus also serves as a switchboard between cortical and subcortical areas, and is thought to be involved in positive social emotions, and even in the experience of being moved [29.142].

In addition, studies of brain-damaged patients that demonstrate selective emotional impairments can reveal which functional brain areas are necessary for emotion perception and experience [29.143]. Electrophysiological studies (mostly using EEG) have typically investigated valence as a core affect in association with music [29.133, 29.144, 29.145]. The results have shown that positive and negative emotions produce different hemispheric lateralization of neural activity: the right hemisphere is involved more during negative, and the left during positive, emotions. However, these studies fail to paint a consistent picture, since their results tend to depend on the techniques employed.

4.4 Musical Materials

The availability of good-quality audio is nowadays virtually limitless, but the number of excerpts used in studies is usually limited by the length and feasibility of the experimental task. In behavioral studies that focus on perceived emotions, a large number of stimuli could potentially be used. Despite this, there are only a few exceptional cases where the number of stimuli utilized in a music and emotion study is truly large: for example, the several thousand music examples used by Schuller et al. [29.146]. Online annotation schemes based on self-reports have typically produced larger datasets than laboratory experiments, with up to 500 music excerpts in some cases [29.147], but this may be at the expense of audio quality, listener attention, and overall control.

Crowdsourcing is one way of harnessing the power of the masses, often through the implementation of online annotation games [29.148, 29.149]. It has been suggested that crowdsourcing yields data of similar accuracy to that generated by expert annotations [29.150], since the sheer volume of data compensates for variation between participants and evaluation settings. Another way of obtaining large amounts of ecologically valid information pertinent to perceived emotions is to tap into social tagging services such as Last.fm or curated databases such as I Like Music. The data in these services contain, among other things, user-defined tags for each music track. The tags may represent genres, preferences, and situations, but a significant proportion of them relate to emotions. For this reason, the collections of tags in these services have been used to build emotion models [29.112] and to connect the emotions to musical and acoustic properties of the music [29.151]. For music and emotion research, these massive datasets have tremendous potential for formulating predictive models, and also for connecting the everyday uses of music to situations, emotions, and individual profiles of listeners (such as age, gender, music preferences, musical expertise and personality). The validity of emotion structures inferred through the analysis of such noisy folksonomies remains, however, to be determined through rigorous comparison with carefully constructed psychometric laboratory studies.
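One simple way such tag collections are turned into emotion models can be sketched as follows. The tag vocabulary and its valence–arousal coordinates below are entirely hypothetical (real studies derive them empirically from large tag co-occurrence data); the sketch merely shows the basic move of positioning a track in affect space by averaging over its emotion-relevant tags, while ignoring genre or situation tags.

```python
# Minimal sketch (all tag names and coordinates are hypothetical): placing
# tracks in a valence-arousal space by averaging the affective coordinates
# of their user-supplied tags, in the spirit of tag-based emotion models.

# Hypothetical lexicon: emotion-relevant tag -> (valence, arousal), each in -1..1
TAG_AFFECT = {
    "happy":      ( 0.8,  0.5),
    "sad":        (-0.7, -0.4),
    "angry":      (-0.6,  0.8),
    "calm":       ( 0.4, -0.7),
    "melancholy": (-0.5, -0.3),
}

def track_affect(tags):
    """Average (valence, arousal) over a track's emotion tags.

    Non-emotion tags (genres, situations, preferences) are simply skipped,
    reflecting that only a proportion of social tags are affect-related.
    Returns None when no emotion tag is present.
    """
    coords = [TAG_AFFECT[t] for t in tags if t in TAG_AFFECT]
    if not coords:
        return None
    v = sum(c[0] for c in coords) / len(coords)
    a = sum(c[1] for c in coords) / len(coords)
    return (v, a)

print(track_affect(["sad", "melancholy", "indie"]))  # low valence, low arousal
print(track_affect(["happy", "gym", "workout"]))     # positive, high arousal
```

In actual folksonomy studies the lexicon itself is the object of analysis (e.g., factoring tag co-occurrences into dimensions), and tag counts would weight the average; this sketch omits both refinements.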

5 Current Challenges

The topic of music and emotions in systematic musicology faces several challenges, many of which have been acknowledged for some time [29.152, 29.53, 29.75]. Perhaps the most important challenges to meet concern the widening of research contexts, both in terms of culture and music listening situation, as this could lead to significant revisions of the main concepts in music and emotion research in the near future.

5.1 Widening the Research Context

The majority of music and emotion studies have been carried out in Western countries and amongst particularly elite groups of people (affluent, young college students) using music from a narrow repertoire (classical, jazz, and film soundtracks). It is clear that the range of people under investigation should be broader. This could mean cross-cultural studies; however, such research is increasingly difficult due to the effects of continuing globalization. Nevertheless, useful tests of generalizability of findings can be performed within different cultural practices, regions, and subcultures found in the West, and by turning attention to representative rather than convenience samples.

Another challenge for the field is the applicability of the results to the situations in which music is listened to. Most studies in this field have been conducted in laboratory settings [29.53], which are unlikely to be environments conducive to complex emotional experiences. This is particularly the case if memory and appraisal mechanisms are tightly controlled, such as when listeners are exposed to unfamiliar music that does not fit their musical identities. Although such an artificial setting is not especially problematic for studying perceived emotions, it is likely to suppress experienced emotions to some extent. One solution is to tap into the listening activities and emotions of listeners with the experience sampling method (ESM, [29.120]), which can now be directly incorporated into smartphones [29.121]. This method, coupled with relevant information about the music, environment, individual features and current activities, can lead to significantly more realistic research settings.

5.2 Narrowing Down the Causal Influences

Only a small minority of music and emotion studies have attempted to establish causal links between musical features and perceived emotions, or between induction mechanisms and the ensuing emotions. In order to understand how low-level mapping mechanisms contribute to contagion, entrainment, or expectancy mechanisms, factorial manipulations of key musical features are indispensable. This line of research was started in the 1970s [29.30] but has not taken hold in the field, despite promising studies on production [29.56], analysis-by-synthesis [29.153], and synthetic stimulus creation [29.154]. With the formulation of emotion induction mechanisms [29.74], causal links have been established between some emotions and mechanisms [29.155, 29.76], but a great deal more has to be done in order to explain how music is able to generate emotions through separate mechanisms at different levels of explanation. The synthesis provided in this chapter has attempted to bridge the gap between the existing mechanisms of music-induced emotions, emotion perception, and core affects. The low-level processes involving recognition, physical characteristics and the embodiment of emotions are necessary building blocks for high-level emotional experiences.

Most of the research in the field relies on retrospective ratings of emotions. Since both music and emotional experiences unfold in time, moment-by-moment fluctuations should be better incorporated into experimental designs and theories. Although continuous self-reports have been used numerous times [29.129, 29.156, 29.157, 29.158], the approach has not yet provided full insight into the underlying processes, perhaps because the modeling techniques are still unsettled. The most promising way to address this problem seems to come from modeling neural responses to music [29.159].
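A basic analytic issue with continuous self-reports is that ratings lag the music, so feature and rating series must be aligned before any modeling. The toy sketch below uses invented 1 Hz series (a loudness contour and an arousal rating trace shifted by two samples) and recovers the response lag with a simple lagged Pearson correlation; real analyses face autocorrelated, noisy series and use considerably more sophisticated time-series models.

```python
# Sketch (invented series): relating a moment-by-moment acoustic feature
# (say, loudness) to continuous arousal ratings via lagged correlation,
# one elementary way of handling the time-series nature of continuous
# self-reports before any serious modeling.

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def best_lag(feature, rating, max_lag):
    """Lag (in samples) at which the feature best predicts later ratings."""
    return max(range(max_lag + 1),
               key=lambda k: pearson(feature[:len(feature) - k], rating[k:]))

# Hypothetical 1 Hz series: the rating trace trails the loudness contour
# by 2 samples, a plausible response delay for continuous rating devices.
loudness = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4]
arousal = [0, 0] + loudness[:-2]  # same contour, delayed by 2 samples

print(best_lag(loudness, arousal, max_lag=4))  # recovers the 2 s delay
```

Once the series are aligned at the recovered lag, ordinary regression between features and ratings becomes meaningful; ignoring the lag systematically dilutes any feature-rating relationship.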

The topic of music and emotion has been undergoing a profitable expansion of themes, approaches and definitions during the last decade. As a topic for systematic musicology, it offers a rich interdisciplinary object of study that will benefit from close cultural and historical readings of the emotions in different eras, places, and subcultures, using methods ranging from strict laboratory experiments designed to tease apart the theoretical constructs, to experiments involving biological markers of emotions. Moreover, the technological advances that have altered the way music is consumed and analyzed (such as music information retrieval) offer additional motivation and research tools to make progress on this topic in a transparent, empirical, and systematic fashion.