Introduction

The study of emotion is a traditional field of psychology (James, 1894). James’ (1890, p. 449) seminal theory suggests that emotions are perceptions of bodily responses. By placing the body at the centre of the emotional experience, James emphasises the importance of physiological arousal in the experience of emotion. Although many authors have explored the predictions of the Jamesian theory of emotion (Furtak, 2018; Haye & Carballo, 2017; Lacasse, 2017), further empirical progress has been prevented, mainly due to methodological issues. Specifically, experimental research about emotion is hindered by issues of ecological validity (Parsons, 2015); that is, emotion induced in laboratory settings is seldom comparable with that experienced in everyday life, lacking, among other things, intensity and experiential content.

To deal with this difficulty, a number of procedures have been employed to elicit emotional states under controlled conditions. Referred to as mood induction procedures, these techniques include music (Västfjäll, 2001), films (Marcusson-Clavertz, Kjell, Persson, & Cardeña, 2019), self-referential statements (the Velten procedure; Kenealy, 1986), memories (Monnier, Syssau, Blanc, & Brechet, 2018; Schaefer & Philippot, 2005), tasks with manipulated difficulty (e.g. Mograbi, Brown, Salas, & Morris, 2012), and others (for a review, see Westermann, Spies, Stahl, & Hesse, 1996). Evidence has suggested that content such as films (Fernández-Aguilar, Navarro-Bravo, Ricarte, Ros, & Latorre, 2019) or images coupled with music (Zhang, Yu, & Barrett, 2014) is more effective in inducing mood. Arguably, this happens because in these procedures content is presented through multiple sensory modalities.

In this regard, one important recent methodological advance is the use of virtual reality (VR) in research into emotion. VR is the term used to describe the computer-generated simulation of a three-dimensional image or environment that can be interacted with in a seemingly real or physical way by a participant. Interaction involves being able to manipulate objects or perform a series of actions by using special electronic equipment, for example, a helmet with a screen inside or gloves fitted with sensors. The illusion of reality is created by stimulating participants’ senses concurrently, including vision, hearing, vestibular sense and, in some cases, touch and proprioception (Chirico, Yaden, Riva, & Gaggioli, 2016). VR provides a number of advantages over previous mood induction techniques. Specifically, it allows immersive experiences in a safe environment, with experimental control over the content and the intensity level of the chosen stimuli. Additionally, as indicated, VR typically stimulates a variety of sensory modalities, leading to the integration of proprioception, interoception and sensorial information (Riva, Wiederhold, & Mantovani, 2019b).

For these reasons, VR has had important applications to psychology. In particular, it has been used as a tool in psychotherapy, in the treatment of patients who suffer from anxiety disorders, such as social anxiety or phobias (Carl et al., 2019; Peperkorn, Diemer, & Mühlberger, 2015), post-traumatic stress disorder (Beidel et al., 2019) and eating disorders (Riva, Gutiérrez-Maldonado, Dakanalis, & Ferrer-García, 2019a). Nevertheless, VR has implications beyond treatment of clinical populations. For instance, a better understanding of VR induction of relaxation may help management of stress, one of the leading causes of physical and mental health problems (Adam et al., 2017; Toussaint, Shields, Dorn, & Slavich, 2016). Additionally, advances in emotion elicitation through VR may impact the entertainment industry (immersive films and videogames), as well as having implications for sports (simulations for athletes) and other professions (e.g. negotiators, doctors).

Although the clinical use of VR has been reviewed recently (Riva, Wiederhold, & Mantovani, 2019b), the large number of studies that have employed this technique with healthy participants has not been reviewed or integrated theoretically. The use of VR in emotion elicitation with healthy participants allows manipulations not used in vulnerable populations, also providing clearer conclusions for theoretical frameworks of emotions and applicability beyond management of patients. Accordingly, the primary purpose of this systematic literature review is to examine the effect of VR in emotion induction, exploring also the methodologies most frequently used in the studies, both to elicit and measure emotion, as well as the effectiveness of VR in inducing specific emotions.

Methods

Four databases were used to search for articles, as follows: Web of Science, Science Direct, PsycInfo and PubMed. Relevant terms, including truncations and variations, were used in order to screen for suitable articles. The following terms were used: ‘Emotion* Induction’ OR ‘Mood Induction’ OR ‘Emotion* Valence’ OR ‘Mood’ AND ‘Virtual Reality’ OR ‘VR’ OR ‘Virtual Environment’.
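The term list above leaves operator precedence implicit. One plausible reading is that the emotion-related terms are combined with OR, the VR-related terms are combined with OR, and the two groups are joined with AND. The following is a minimal sketch under that assumption; the helper names `or_group` and `build_query` are hypothetical, and each database has its own syntax for quoting and truncation, so the resulting string would likely need adapting before use.

```python
# Terms as listed in the Methods; the grouping below is an assumption,
# not a statement of how the original searches were entered.
EMOTION_TERMS = ["Emotion* Induction", "Mood Induction", "Emotion* Valence", "Mood"]
VR_TERMS = ["Virtual Reality", "VR", "Virtual Environment"]


def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(emotion_terms, vr_terms):
    """Join the two OR-groups with AND so a record must match both concepts."""
    return f"{or_group(emotion_terms)} AND {or_group(vr_terms)}"


if __name__ == "__main__":
    print(build_query(EMOTION_TERMS, VR_TERMS))
```

Making the parentheses explicit avoids the ambiguity of a flat OR/AND chain, which some search engines would otherwise evaluate strictly left to right.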

This systematic review focused on studies inducing emotions in healthy participants. In addition, studies had to include empirical data collection and at least one outcome measuring emotion (self-report, physiological or behavioural). Studies with participants who had a diagnosis and studies using VR as a form of therapy were excluded. Only studies written in English were included, and all papers selected were published from 2008 to 2018.

A total of 1752 records were found by using the search terms. The studies were filtered by reading the title or abstract, leading to the exclusion of articles due to lack of adherence to the main theme of the review. Duplicates found in each database were also removed. This led to a total of 86 articles that were fully assessed by reading the entire article. After full-text reading, 24 articles were excluded for not meeting the inclusion criteria, leaving 61 articles eligible for this review (Fig. 1).

Fig. 1 Flow diagram showing the steps conducted during the literature search and selection process, based on the PRISMA model (Moher, Liberati, Tetzlaff, Altman, & The PRISMA Group, 2009)

Results

Summary information about the studies can be seen in Table 1.

Table 1 Summary of studies

Participants

The studies showed a wide range of sample sizes, from 10 (Brundage, Brinton, & Hancock, 2016) to 324 participants (McCall, Hildebrandt, Hartmann, Baczkowski, & Singer, 2016). The majority of the studies used convenience sampling, with volunteers, typically college students, being included according to availability. Most studies then randomised participants to the experimental and control conditions. Whenever randomisation was not used, participants were typically matched for sociodemographic variables, such as gender. Studies on anxiety and stress commonly used all-male samples on account of hormonal issues. However, female participants were the majority among the studies reviewed.

One of the studies (Baños et al., 2012) had a sample composed of older adults, focusing on the use of VR to promote emotional states such as relaxation and joy in this age group. Out of the 61 papers, 12 included a sample with a mean age above 25 years. With the exception of the study with older adults, none of the studies had a specific VR paradigm to fit participants’ characteristics.

Study Design

The mixed-design approach, containing both between- (e.g. experimental condition) and within-subject factors (e.g. pre-post exposure), was the most widely employed (24 studies), with participants being randomly allocated to the different experimental conditions in 16 studies. The single-group repeated measures design, with the same participants being exposed to different experimental conditions, was used in 23 of the reviewed papers. Fourteen studies employed between-subject designs, comparing different participants allocated to the experimental conditions; in eight of those, participants were randomly allocated to the conditions.
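The design counts reported above can be cross-checked with a short tally. This is only an illustrative sketch: the dictionary layout and the function names are assumptions introduced here, and the figures are those stated in the preceding paragraph.

```python
# Counts as reported in the Study Design paragraph.
# "randomised" is None where random allocation to conditions does not apply
# (single-group repeated measures designs).
DESIGNS = {
    "mixed": {"studies": 24, "randomised": 16},
    "within_subject": {"studies": 23, "randomised": None},
    "between_subject": {"studies": 14, "randomised": 8},
}


def total_studies(designs):
    """Sum the number of studies across all design types."""
    return sum(d["studies"] for d in designs.values())


def randomised_share(designs):
    """Share of studies with a between-subject factor that randomised allocation."""
    with_between = [d for d in designs.values() if d["randomised"] is not None]
    randomised = sum(d["randomised"] for d in with_between)
    return randomised / sum(d["studies"] for d in with_between)


if __name__ == "__main__":
    print(total_studies(DESIGNS))
    print(round(randomised_share(DESIGNS), 3))
```

The three design counts add up to the 61 reviewed studies, and 24 of the 38 studies with a between-subject factor (about 63%) reported random allocation.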

Elicited Emotions

Altogether, the studies looked at a total of eight different emotions, including both positive and negative moods. They were joy/happiness, fear, sadness, disgust, anger, relaxation, stress and anxiety. Some of the studies did not focus on discrete emotions, investigating instead the effects of virtual environments on emotional arousal and valence (n = 18). Two studies focused on the induction of complex emotions: Nazry and Romano (2017), who elicited positive emotions, and Chittaro, Sioni, Crescentini, and Fabbro (2017), who elicited death-related emotions. Anxiety, relaxation, fear and joy were the most frequently elicited emotions, appearing in 14, 13, 11 and 7 studies, respectively. Anger, stress and sadness were elicited in few studies, appearing in 6, 6 and 4 studies, respectively. A neutral state was elicited in one study, alongside other emotional states. The least elicited emotion, appearing only in two of the reviewed studies, was disgust. Thirteen studies elicited more than one discrete emotion, typically inducing anxiety/stress together with relaxation or a combination of basic emotions. Disgust and neutral states appeared only in studies eliciting other emotions as well.

Virtual Reality Equipment

Among the equipment used as a mediator between the participants and virtual environments (VEs), the Head-Mounted Display (HMD) with tracking device, which consists of electronic goggles plugged into a computer, was the most used, appearing in 34 of the reviewed papers. The HMD was followed by the Cave Automatic Virtual Environment (CAVE), consisting of a room surrounded by wall-sized screens, which appeared in ten studies. Few studies used a videogame as a VE to elicit emotions. The CAVE equipment was used particularly in studies that elicited stress. Videogames were used to elicit emotions such as stress and relaxation, with some studies eliciting anger and positive emotions.

Subjective Measures

Subjective measures were used to assess the efficacy of emotion induction in the VR paradigm, as well as other constructs that were specific to the studies. Several subjective measures were found in the studies analysed in this review, with the scales and questionnaires varying depending on the emotion elicited. A group of 26 studies did not use a physiological measure to assess emotion induction, relying on subjective measures only. The State-Trait Anxiety Inventory (STAI; Spielberger, 1983) was the most commonly used scale, appearing in 16 studies. It is a 20-item self-report questionnaire, with each item scored on a 4-point Likert scale, that assesses anxiety both as a transitory state and as a characteristic (trait). The Visual Analogue Scale (VAS) was the second most used instrument, appearing in 12 studies. The scale is used to measure feelings, with participants placing a mark on a line representing their subjective state, with scores ranging from 0 to 100 (Cline, Herman, Shaw, & Morton, 1992). The Positive and Negative Affect Schedule (PANAS), with 20 items divided into 2 factors exploring positive and negative affect (Crawford & Henry, 2004), was present in 8 of the analysed studies. The Self-Assessment Manikin (SAM), used in 8 of the reviewed studies, is a scale that measures valence, arousal and dominance responses to each emotional stimulus, with visual aids anchoring the scoring on a 5-point Likert scale (Bradley & Lang, 1994). The Beck Depression Inventory (Beck, Ward, Mendelson, Mock, & Erbaugh, 1961), a 21-item depression questionnaire comprising 2 main factors (somatic and cognitive symptoms of depression), appeared in 7 studies. Anxiety, fear and stress were commonly assessed subjectively via the STAI, with a smaller number of studies using the VAS and the SAM.
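Because these instruments use different response formats, their raw totals are not on a common scale. The sketch below illustrates the arithmetic for a summed Likert questionnaire and a simple linear rescaling to the 0 to 100 range used by the VAS. It assumes items are scored from 1 upwards and ignores reverse-scored items; the function names are hypothetical, and the rescaling is offered only as an illustration, not as a procedure used in the reviewed studies.

```python
def likert_range(n_items, n_points, lowest=1):
    """Return the (minimum, maximum) possible total for a summed Likert scale.

    Assumes every item is scored from `lowest` to `lowest + n_points - 1`.
    """
    minimum = n_items * lowest
    maximum = n_items * (lowest + n_points - 1)
    return minimum, maximum


def rescale_to_100(total, minimum, maximum):
    """Map a raw total linearly onto 0-100, the range described for the VAS."""
    if not minimum <= total <= maximum:
        raise ValueError("total lies outside the possible score range")
    return 100.0 * (total - minimum) / (maximum - minimum)


if __name__ == "__main__":
    # A 20-item questionnaire with 4-point items, as described for the STAI.
    low, high = likert_range(n_items=20, n_points=4)
    print(low, high)
    print(rescale_to_100(50, low, high))
```

Under these assumptions a 20-item, 4-point questionnaire yields totals between 20 and 80, so a raw total of 50 sits at the midpoint of the rescaled range.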

Physiological Measures

Physiological measures were used in 35 of the analysed papers. Electrodermal activity was the most commonly employed measure (23 studies), followed by heart rate (21 studies). Few studies used electromyography and respiration rate measures to complement their findings. Fear and anxiety were usually assessed physiologically by electrodermal activity measures, while joy and relaxation were usually assessed by heart rate measures. Only three of the analysed papers included neuroimaging data collection, namely Functional Magnetic Resonance Imaging (fMRI) and Electroencephalography (EEG).

Effectiveness of VR in Inducing Emotions

Most of the reviewed studies were successful in inducing the selected emotion. In particular, stress and anxiety induction was successful in all studies, with very large effects in a few cases (e.g. Jönsson et al., 2010). Joy, fear and relaxation induction was successful in most studies, although this varied with experimental condition (Gordon, Merchant, Zanbaka, Hodges, & Goolkasian, 2011, relaxation; Toet & van Schaik, 2012, fear).

For a few emotions, such as sadness and anger, effectiveness of induction was lower, with some studies not reporting changes (Felnhofer et al., 2015; Rodríguez, Rey, Clemente, Wrzesien, & Alcañiz, 2015) or only short-lived emotional alterations (Zumbach, Seitz, & Bluemke, 2015, anger). A few studies showed consistently minor changes, regardless of the emotion elicited (Matthias & Beckhaus, 2012). In the study by Jackson, Michon, Geslin, Carignan, and Beaudoin (2015), the elicited emotion was not specific to its experimental condition, with disgust being reported in the anger-inducing condition also.

Studies that explored dimensional aspects of emotion, such as arousal and valence, typically were successful in inducing affective states, with the exceptions being Aymerich-Franch (2010), Murray, Neumann, Moffitt, and Thomas (2016) and Verplaetse and De Smet (2016).

Ethical Issues

According to the reviewed studies, emotion elicitation through VR, similar to other mood induction procedures, appears to induce affective states within the normal range of everyday experience. No study reported adverse effects following VR use, with some excluding participants on the basis of previous negative experiences (e.g. Dibbets & Schulte-Ostermann, 2015). To avoid finishing the testing sessions in a negative mood, studies included positive stimuli after negative emotion elicitation (e.g. Rodríguez et al., 2015; van Strien et al., 2013), other activities (e.g. mindfulness, Cuperus, Laken, van den Hout, & Engelhard, 2016) or a resting period (e.g. Fich et al., 2014). Carry-over effects between experimental conditions were prevented in a few studies by counterbalancing (e.g. Kwon, Powell, & Chalmers, 2013) or randomising conditions (Mousas, Anastasiou, & Spantidi, 2018). A few studies did not provide information on ethics committee approval.

Discussion

In summary, this systematic review compiled 61 studies that used VR as an emotion induction paradigm. In most studies, participants were college students and mixed or within-subject designs were employed, with the most common emotions elicited being anxiety, relaxation, fear and joy. VR equipment used included the Head-Mounted Display, the Cave Automatic Virtual Environment, videogames and computers, with almost all studies (58/61) using subjective measures of emotion and around half of them (35/61) including physiological measures. The findings suggest that VR is consistently effective as a tool to induce emotion in the lab, for a variety of emotional states, assessed by different outcome measures.

The use of college students in a majority of studies is a trend also in other areas of psychological research, and may reduce generalisability of findings, focusing on what has been termed Western, Educated, Industrialized, Rich and Democratic (WEIRD) samples (Henrich, Heine, & Norenzayan, 2010). Further studies in other age, gender, educational achievement and ethnic groups, as well as research conducted in developing countries, are needed to complement current findings, tailoring procedures for the groups studied (e.g. Baños et al., 2012). Regarding study designs, the most common approach was the mixed design, including both between- and within-subject factors. In the studies reviewed, between-subject factors were typically the experimental conditions, with within-subject factors indicating time (pre- and post-intervention), but in a few cases, different virtual environments were also repeated for the same participants. The latter approach was used in a large number of studies that employed a full within-subject design; although this provides perfect matching of participants across conditions (i.e. the same participants do all of them), carry-over effects, a concern in emotional elicitation studies, need to be avoided. The majority of studies including between-subject factors randomised participants to the experimental conditions, allowing for a tighter control of intervening variables.

In relation to equipment, the HMD was the most commonly used, which is in agreement with previous research on clinical use of VR (e.g. motor rehabilitation: Dascal et al., 2017; mental health management: Valmaggia, Latif, Kempton, & Rus-Calafell, 2016). This is possibly due to its great immersive capacity, which is also seen in the CAVE paradigm (Jönsson et al., 2010), albeit with higher costs. The self-reported measures frequently used in emotion elicitation are also consistent with those indicated in reviews exploring clinical use of VR, alongside instruments used for specific conditions (psychosis: Rus-Calafell, Garety, Sason, Craig, & Valmaggia, 2018; addiction: Segawa et al., 2019), suggesting that comparisons between findings obtained with clinical and non-clinical samples, including of effect sizes, are possible. A large proportion of studies relied solely on self-reported measures, which have been criticised for their subjectivity, proneness to demand characteristics and difficulty of access in certain cases (e.g. subtle emotions) (Baumeister, Vohs, & Funder, 2007), suggesting that, when available, future research should include physiological measures, including when investigating specific emotions (for a review, see Kreibig, 2010).

Effectiveness of VR varied according to the emotion investigated, with stress and anxiety being consistently elicited. It is possible that this reflects more refined experimental paradigms for these emotions, given their use in clinical groups (Segawa et al., 2019). Successful elicitation of stress and anxiety is in agreement with the fact that studies using a dimensional approach, i.e. inducing valence or arousal in general, were normally effective, given that both stress and anxiety are less specific emotional responses. In any case, discrete emotions were also successfully elicited, although for sadness and anger this was either reduced or short-lived.

Having stress, relaxation, fear and joy as the most elicited specific emotions in healthy adults is in agreement with the notion that experimental approaches in healthy participants can be used to model clinical phenomena and responses to them. In addition to management of clinical groups (e.g. Gorini et al., 2009), health applications of VR may include promotion of emotional regulation as a prophylactic measure in healthy groups (Montana et al., 2020; Wrzesien et al., 2015). Improvements in the elicitation of these emotions may have a variety of applications, for instance in the entertainment industry (e.g. horror games, comedy films), as self-development tools (relaxation in paradigms of contemplative practices) and as training resources (e.g. habituating professionals exposed to danger and stress to simulations of real-life situations).

Given that VR use with clinical groups has been summarised elsewhere (e.g. Segawa et al., 2019), this article focused on work with healthy participants. Although this limits clinical applicability of findings, it emphasises the reactivity of non-clinical groups to emotion induction using VR, facilitating research and everyday life applications. Additionally, although a meta-analytic approach could have been attempted, considerable heterogeneity of methodologies prevented such an analysis; with the expansion of the literature in this field, a critical mass of papers in each subtopic may allow combination of effect sizes.
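To make concrete what combining effect sizes would involve, the following sketch shows standard inverse-variance (fixed-effect) pooling together with Cochran's Q and the I² heterogeneity index. No such analysis was run for this review; the function name `pool_fixed_effect` is hypothetical and the input values in the example are invented purely for illustration, not taken from the reviewed studies.

```python
def pool_fixed_effect(effects, variances):
    """Inverse-variance pooling of effect sizes.

    Returns the pooled estimate, Cochran's Q and I^2 (as a proportion).
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal in length")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of variability attributed to heterogeneity rather than chance.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared


if __name__ == "__main__":
    # Made-up standardised mean differences and their variances (illustrative only).
    example_effects = [0.2, 0.8]
    example_variances = [0.04, 0.04]
    print(pool_fixed_effect(example_effects, example_variances))
```

A high I² in such an analysis would signal the kind of methodological heterogeneity noted above, in which case a random-effects model or separate analyses per subtopic would usually be considered instead.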

Finally, although the reviewed studies did not explore the mechanisms behind emotion elicitation through VR, findings allow us to speculate on the reasons behind the effectiveness of VR in causing emotional experiences. In particular, studies that integrated subjective and physiological measures suggest that VR is capable of inducing emotions at a bodily level. Coupled with the preference for equipment that leads to higher immersion, and the known influence of presence on emotional responses in VEs (Price, Mehta, Tone, & Anderson, 2011), it is likely that the mechanisms behind VR emotional induction are linked to the integration of sensory data from multiple sources, including visceral, proprioceptive, visual and auditory information. If correct, that would be in agreement with traditional perspectives in Psychology (James, 1894) and current trends that emphasise the relevance of embodiment in cognitive and affective processes (Kiverstein & Miller, 2015).