1 Introduction

The significance of emotions in our lives has spurred researchers to develop systems that can detect and respond to users’ emotional states. These systems have the potential to significantly improve our well-being by enabling mental health symptom recognition and care interventions [17], adapting home functionalities for improved well-being [29], or adjusting car characteristics to reduce stress impact during driving [16]. Emotion Recognition, a key topic in Affective Computing, is fundamental for building such intelligent technologies.

Emotion Recognition involves detecting emotional states from visual, audio, and physiological data, which requires training algorithms with ground-truth data. Ground-truth data is collected and annotated through affective self-report, which is essential for obtaining valid and accurate datasets for emotion recognition systems [3]. The instruments used to collect self-reports can influence the accuracy of the data and its labels. Various tools are available to measure affect, ranging from long pen-and-paper questionnaires to digital interfaces. However, there is a growing need in the research community for a valid, generic, quick-to-use, and intuitive instrument for self-reporting affect [5, 11, 15].

2 Background and Related Work

Although there is no universally accepted definition of emotion, researchers concur that emotional states involve coordinated changes across multiple components, including a physiological component, a behavioral or expressive component, and a subjective feeling component [13, 19]. Sensors can implicitly measure the physiological component, and facial expression recognition can assess the expressive component to some extent, but accessing the subjective feeling component requires explicit self-report. Russell refers to this subjective emotional component as “core affect”: a consciously available neurophysiological state described as a point in a valence (pleasure-displeasure) and arousal (sleepy-activated) space [26]. Valence and arousal are also fundamental to a prominent dimensional model used to classify emotions, known as the circumplex model of affect [24], which is widely used in fields such as psychology [8] and HCI [11]. Furthermore, these dimensions are commonly employed to rate emotional stimuli sets [7, 9, 10] and annotate ground-truth datasets [3, 31].

Arousal, also known as energy, activation, or intensity, is widely recognized as the intensity of the experienced emotion. On the other hand, valence, also known as pleasantness, refers to the hedonic quality that reflects an emotion’s degree of positivity or negativity. Researchers in the affective sciences and related disciplines commonly use the terms arousal and valence. However, these terms do not typically carry the same meaning in everyday English, which can cause confusion. For instance, valence is often associated with the ability of atoms to combine with others, while arousal is often associated with sexual excitement. This disparity in usage may result in users misunderstanding the intended meaning of these terms when encountered in interfaces for self-assessing emotional states. Therefore, it is crucial to investigate how users perceive and interpret these concepts concerning emotional states.

Various instruments enable individuals to self-report their affect using the dimensions of arousal and valence, ranging from traditional pen-and-paper psychology instruments like the PANAS scales [30] to modern digital interfaces. For example, the Self-Assessment Manikin (SAM) (Fig. 1-A) [4] is a widely recognized pictorial scale that portrays the dimensions of the Pleasure-Arousal-Dominance (PAD) model [27] using manikin figures arranged in three rows. Users select the level that best represents their perception of each dimension using a nine-point Likert scale accompanying each row. SAM has been extensively validated and employed. However, researchers have identified some limitations, including the need for clear instructions and compliance with the SAM usage protocol [2, 5]. Another drawback is the potential misinterpretation of arousal, represented by an “explosion” in the stomach area, which may result in incorrect responses [28]. The Affect Grid (Fig. 1-B) [27] is a paper-based, single-item scale developed in 1989 for quick and easy assessment of affect through valence and arousal, based on Russell’s Circumplex Model [24]. Its interface is a 9x9 table in which the center cell represents a neutral feeling, the vertical dimension represents arousal, and the horizontal dimension represents valence. Participants mark a cell of the grid to indicate their emotions. The Affective Slider (AS) (Fig. 1-C) [2] is a digital scale composed of two slider controls that measure valence and arousal. Emoticons are placed at the edges of the sliders to represent bipolar affective states, and two opposed triangles underneath each slider serve as visual cues for intensity. The Photographic Affect Meter (PAM) (Fig. 1-D) [23] is designed for mobile devices, with a user interface based on a 4x4 grid containing 16 randomly selected photographs that represent a diversity of emotions. Users choose the image that best captures their feelings at the moment, and each photo maps to a score in Russell’s Circumplex. The Circumplex Affect Assessment Tool (CAAT) (Fig. 1-E) [6] is a widget for assessing emotional experiences. It displays 25 selectable emotion nodes, each with its own feeling word and color, arranged in a layout based on Plutchik’s model [22]. The AffectButton (Fig. 1-F) [5] is a button that displays a face that changes as the user’s mouse pointer moves inside it. Users select the facial expression that represents their feelings. The x and y coordinates within the button define values on the PAD model [21].
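To make the grid-based idea concrete, the following minimal sketch shows how a marked cell of a 9x9 Affect Grid could be turned into a pair of scores. The function names, the 1 to 9 scoring range, and the orientation (top row as highest arousal, leftmost column as lowest valence) are assumptions made for illustration; they are not taken from the instrument's official scoring instructions.

```python
from __future__ import annotations


def affect_grid_scores(row: int, col: int, size: int = 9) -> tuple[int, int]:
    """Map a marked grid cell to (valence, arousal) scores in 1..size.

    Assumed convention (illustrative only): row 0 is the top row and
    stands for the highest arousal; col 0 is the leftmost column and
    stands for the lowest valence.
    """
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError("cell lies outside the grid")
    valence = col + 1        # horizontal dimension
    arousal = size - row     # vertical dimension, top = high
    return valence, arousal


def is_neutral(row: int, col: int, size: int = 9) -> bool:
    """True for the center cell, which represents a neutral feeling."""
    center = size // 2
    return row == center and col == center
```

Under these assumptions, the center cell (row 4, column 4) yields the midpoint score on both dimensions, and the top-right cell yields the maximum on both.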

3 Method

We conducted five design workshops in which participants were asked to brainstorm designs of a graphical interface for reporting emotions using the arousal and valence dimensions. The main objective of these sessions was to generate ideas and solutions for effectively representing emotion-related concepts and enhancing the design of affective self-report interfaces. A total of 29 participants took part in the workshops: 6 in Workshop 1 (WS1), 4 in WS2, 7 in WS3, 6 in WS4, and 6 in WS5. The participants (22 men and 7 women, averaging 33 years old) were recruited via social media and word-of-mouth, mainly targeting designers and software engineers. The study took place in Portugal.

The workshops, each lasting approximately 2.5 h, commenced with introductions, consent forms, and a brief demographic questionnaire. Subsequently, we provided instruction on valence and arousal and the Circumplex Model [24]. Participants were then prompted to brainstorm and sketch potential design options for a digital interface intended for self-reporting emotional states, drawing upon the concepts they had just learned. We told participants that the interfaces should ideally be generic and cross-platform, but they were free to consider specific platforms if they preferred, as our primary objective was to generate many ideas and observe how valence and arousal would be depicted.

We used thematic analysis as our qualitative data analysis approach to identify and organize recurring themes in the data. Due to space limitations, we only summarize the coding process. Our research team followed a systematic process that began with open coding to generate an initial list of codes, which were then refined and organized into categories. Consistency and validity were ensured through team-based reviews and discussions of the codes and categories. Several patterns and themes emerged from this analysis, and this paper presents the seven most relevant themes and their design implications.

Fig. 1. Images of the interfaces of A - The Self-Assessment Manikin; B - Affect Grid; C - The Affective Slider; D - Photographic Affect Meter; E - Circumplex Affect Assessment Tool; F - AffectButton.

4 Results

4.1 Arousal and Shape

Participants generally associated high arousal levels with irregular and shaky lines, while they connected low arousal levels with smooth lines. For example, P5.2 (participant 2 from Workshop 5) correlated arousal with the lines in electrocardiogram charts. This association between shape and emotions, where high-arousal emotions like anger are often depicted with spikes, and low-arousal emotions like sadness are often represented by round shapes, is also commonly seen in everyday design.

4.2 Arousal and Movement

During the workshops, participants frequently discussed how to depict arousal, often starting with high arousal levels, and linking them to energetic movements. For example, participant P3.5 designed a wristwatch (Fig. 2) that required users to shake their wrists to indicate their level of arousal, with more vigorous movements indicating higher arousal levels.

4.3 Arousal and Body-Related Concepts

Participants also related arousal to the body. For instance, P2.1 suggested a slider for arousal that changes the body’s position as it moves, while P3.1 mentioned relating arousal to a fast or slow heartbeat. The connection between arousal (also referred to as body activation) and the body is well-established, as arousal has been linked to activity in the sympathetic nervous system [24] and is relevant to the perception and identification of emotions [20]. In addition, interoception, which refers to the perception of sensations from within the body and how they relate to emotional states, has been found to contribute positively to health and well-being [25]. Users need to be aware that attending to their bodily signals can help them gauge their arousal levels when self-reporting affect. Therefore, the interface should convey that arousal is experienced intrinsically in the body and encourage users to pay attention to their bodily sensations when measuring arousal.

Fig. 2. The two images depict sketches of one of the proposed solutions. In this design, the user initiates the process by verbally expressing their emotion, which is interpreted as valence. Next, the user adds the level of arousal by shaking their arm.

4.4 Valence and Facial Emojis

Participants consistently considered facial features and emojis as potential representations of valence, which may be due to the widespread use of emojis in messaging apps to convey emotions. For example, P2.1 stated, “We can use smileys, which are very popular and commonly used in WhatsApp.” P3.7 also emphasized the significance of eyebrows in conveying emotional expressions in comics. These considerations are supported by research indicating that valence measures correlate with the activity of facial muscles involved in emotional expression [24].

4.5 Valence, Arousal, and Color Properties

In the workshops, participants discussed using colors to represent arousal or valence. Consensus was challenging due to cultural differences, but some groups agreed on specific color properties. Saturation was chosen to represent valence, while brightness was selected for arousal. Participants associated negative valence with “dark” and “lifeless” colors, and positive valence with bright, vivid colors, aligning with artistic and design practices [1]. For instance, P5.5 remarked, “If it is colorless, normally a person associates it with negative state,” referring to dark colors and shades of grey. On the other hand, P2.1 mentioned, “I also like the idea of going from the darkest color to the lighter color,” referring to the transition from negative to positive valence.

4.6 The Relationship Between Arousal and Valence

Most participants seemed to believe valence was the emotion per se and that arousal was its intensity (i.e., arousal was a property of valence). For instance, Fig. 3 shows an interface where the buttons “+” and “−” control the intensity of each face (meant to denote valence).

Fig. 3. One of the proposed solutions: the user begins by selecting a face that represents their desired valence, then adjusts arousal levels using the “+” and “−” buttons below.

Similarly, Fig. 2 shows a design where the interface first asks the user to evaluate valence by responding to the question “How happy?” and then to rate the intensity of that response (i.e., the intensity of valence) by shaking the wrist. Despite some opposition [18], the scientific literature mainly states that valence and arousal are separate dimensions that are equally important in relation to emotion. Thus, the elements in the interface that represent them should be at the same hierarchical level. As such, the two elements should be visually similar in shape, size, color, proximity, and direction (Gestalt principle of similarity [12]). In addition, we believe that if users think about arousal first, they will consider it on its own rather than as dependent upon valence. We can achieve this by placing arousal first in the interface, either horizontally or vertically.

4.7 Feedback and Introspection

Participants consistently emphasized the importance of feedback to confirm the accuracy of their input in terms of arousal and valence values aligning with their intended emotion label. P4.1 explicitly stated this need: “as you navigated through different zones, there would be pop-ups with keywords associated with feelings to help you understand what that feeling was like, to see if it corresponded to what you were feeling.” Building on this concept, participants in WS4 devised solutions where users could view the emotion label simultaneously while using the self-assessment interface element.

An integral aspect of interface design is providing timely feedback to keep users informed [14]. However, some participants expressed concerns about immediate feedback, such as displaying the emotion label, as it could impede introspection and adversely impact the process of emotional self-assessment. For instance, P1.6 remarked, “I liked the slider better. At the end, you could even submit, and it would give you a final face, that would be OK. But not at the moment you are still trying to figure it out” and “I think that if we want to make an assessment, it is better not to see the final result.” In addition, this participant highlighted how seeing an emotion label while selecting arousal and valence values could influence the final answer, potentially leading users to search for a label rather than focusing on separate variables. P2.1 also shared similar views, stating, “I think it is cool that you give people control of the two dimensions separately, but you can see the joining of the two to what it corresponds to. But she has to work on both separately, contributing to that awareness. It seems to me that this is more important.” This participant emphasized the significance of users reflecting on arousal and valence separately before receiving feedback (i.e., an emotion label corresponding to the chosen values). Affective self-reports need introspection, and this psychological process should be considered during the design phase. Displaying a label immediately as users select arousal and valence values might hinder introspection, resulting in less focused and truthful choices based on the two dimensions. Furthermore, it is worth noting that there is no scientifically proven mapping of arousal and valence to specific emotion labels. Therefore, assessing arousal and valence values directly is imperative, as inferring values from a label may not be accurate. 
Additionally, separating an emotion into two variables, arousal and valence, could enhance the final assessment, as users need to contemplate various aspects of the emotion.

5 Design Implications

5.1 Representing Arousal Through Shape

Based on our findings, we recommend that shapes used to represent arousal in an interface transition from smooth to shaky as the intensity increases. For instance, smooth and delicate shapes should indicate low arousal levels, while wobbly and thick shapes should depict higher levels of this dimension.
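As an illustration of this recommendation, the sketch below generates the points of a line whose irregularity grows with the arousal value. It is only one possible reading of "smooth to shaky": the function names, the normalized 0 to 1 arousal range, the uniform random jitter, and the `max_jitter` parameter are all assumptions introduced here and were not proposed by the workshop participants.

```python
import random


def arousal_line(arousal, n_points=50, max_jitter=1.0, seed=0):
    """Return (x, y) points of a line that gets shakier as arousal rises.

    arousal is assumed to be normalized to [0, 1]: 0 gives a perfectly
    flat (smooth) line, 1 gives the largest vertical jitter.
    """
    if not 0.0 <= arousal <= 1.0:
        raise ValueError("arousal must be in [0, 1]")
    rng = random.Random(seed)            # seeded so the shape is reproducible
    amplitude = arousal * max_jitter     # jitter scales linearly with arousal
    return [
        (i / (n_points - 1), rng.uniform(-amplitude, amplitude))
        for i in range(n_points)
    ]


def roughness(points):
    """Sum of absolute vertical steps; a crude measure of shakiness."""
    return sum(abs(y2 - y1) for (_, y1), (_, y2) in zip(points, points[1:]))
```

The returned points could be passed to any drawing routine; `roughness` is included only to check that higher arousal values do produce a more irregular line.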

5.2 Representing Arousal Through Movement

According to our research, vigorous movements could indicate the expression of higher arousal levels. Expressing arousal by shaking a watch or other objects could accommodate a broader range of users, including those with visual impairments.
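A shake-based input of this kind could be approximated as in the sketch below, which converts a window of accelerometer samples into a normalized arousal value. This is a hypothetical illustration: the function name, the use of mean deviation from gravity as the measure of vigor, and the `max_excess` calibration constant are assumptions, and a real device would need its own calibration.

```python
import math

GRAVITY = 9.81  # m/s^2, magnitude measured by a device at rest


def shake_to_arousal(samples, max_excess=20.0):
    """Map accelerometer samples to an arousal value in [0, 1].

    samples: iterable of (ax, ay, az) readings in m/s^2.
    max_excess: assumed calibration constant; the mean deviation from
    gravity (in m/s^2) that is treated as maximal arousal.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    # How far each reading's magnitude departs from the resting value.
    excess = [
        abs(math.sqrt(ax * ax + ay * ay + az * az) - GRAVITY)
        for ax, ay, az in samples
    ]
    mean_excess = sum(excess) / len(excess)
    return min(mean_excess / max_excess, 1.0)  # clamp to the top of the scale
```

With these assumptions, a device lying still maps to an arousal of approximately zero, and more vigorous shaking maps to proportionally higher values up to the clamp.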

5.3 Representing Arousal Through Body-Related Concepts

Users need to recognize that assessing arousal involves interpreting the body’s signals. Therefore, the interface should emphasize that arousal is an intrinsic sensation that originates from the viscera. As a result, incorporating body-related images and metaphors can be beneficial for conveying arousal in the interface. Alternatively, audible instructions, such as those used in guided meditations, may also be helpful.

5.4 Representing Valence Through Facial Emojis

Our research suggests that using facial emojis and facial features to represent valence is a good choice. However, it is important to consider that different emojis can carry different meanings in different cultures.

5.5 Representing Valence and Arousal Through Color Properties

Cultural barriers can make establishing relationships between colors and emotional properties challenging. However, our findings suggest that saturation and brightness may be worth exploring as a way to indicate varying levels of valence or arousal. Dark and lifeless colors could represent negative values at one end of the spectrum, while bright and vivid colors could be set at the opposite end, representing positive values.
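One way to explore this mapping is sketched below, using the HSV color model from Python's standard `colorsys` module. Following the pairing that some workshop groups agreed on, valence drives saturation and arousal drives brightness. The function name, the normalized 0 to 1 ranges, and the fixed hue are assumptions chosen for illustration; the choice of hue in particular is arbitrary and would itself be subject to the cultural differences noted above.

```python
import colorsys


def affect_to_rgb(valence, arousal, hue=0.6):
    """Return an (r, g, b) tuple of 0..255 integers for an affective state.

    valence and arousal are assumed to be normalized to [0, 1].
    valence -> saturation (low valence looks grey and "lifeless"),
    arousal -> brightness (low arousal looks dark).
    hue is an arbitrary fixed value, not derived from the study.
    """
    for name, value in (("valence", valence), ("arousal", arousal)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    r, g, b = colorsys.hsv_to_rgb(hue, valence, arousal)
    return tuple(round(channel * 255) for channel in (r, g, b))
```

In this sketch, zero valence always produces a shade of grey and zero arousal always produces black, so designers would likely want to restrict both ranges to keep the color distinguishable.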

5.6 Ask for Arousal First

During our study, we discovered a common misconception: many participants perceived valence as the emotion itself and arousal as its intensity, i.e., a property of valence. To address this issue, the interface layout must communicate to users that valence and arousal are independent dimensions. We propose two steps to achieve this. First, both dimensions should be visually represented similarly, with equal size and parallel positioning, following the Gestalt principle of similarity, to convey equal importance. Second, we suggest presenting arousal to users first, followed by valence, by placing it on top in a vertical layout or on the left in a horizontal one. This sequence aims to prompt users to consider arousal as a separate dimension, independent of valence, and prevent them from treating arousal as a property of valence.

5.7 Help Users Confirm Their Choices but Preserve Introspection

Assessing emotional states is a deeply personal and introspective experience. Users want feedback to confirm their choices, but immediate and explicit feedback can hinder introspection and bias responses. There are several ways to address this dilemma. One approach is offering abstract or subjective feedback, such as adjusting the screen’s brightness or incorporating abstract designs. Another option is to implement a “submit” button, so that objective feedback, such as an emotion label, is displayed only after the user completes the assessment by clicking on it. Additionally, strategically placing prototypical emotion labels on the interface can help guide users during the assessment process, providing further assistance without compromising their introspective experience.
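The “submit” option can be sketched as a small state holder that withholds the label until the user has finished. The class and method names below are hypothetical. Because there is no scientifically proven mapping from arousal and valence to emotion labels, the sketch does not assume one: the caller supplies its own `labeler` function.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DeferredFeedbackReport:
    """Collects arousal and valence; reveals a label only after submit()."""

    # Hypothetical hook: any (arousal, valence) -> label function supplied
    # by the caller. No particular mapping is assumed here.
    labeler: Callable[[float, float], str]
    arousal: Optional[float] = None
    valence: Optional[float] = None
    submitted: bool = False

    def set_arousal(self, value: float) -> None:
        self._ensure_open()
        self.arousal = value

    def set_valence(self, value: float) -> None:
        self._ensure_open()
        self.valence = value

    def submit(self) -> None:
        if self.arousal is None or self.valence is None:
            raise ValueError("set both arousal and valence before submitting")
        self.submitted = True

    def feedback(self) -> Optional[str]:
        """None while the user is still choosing; the label afterwards."""
        if not self.submitted:
            return None
        return self.labeler(self.arousal, self.valence)

    def _ensure_open(self) -> None:
        if self.submitted:
            raise RuntimeError("report already submitted")
```

While the user adjusts the two values, `feedback()` returns `None`, so the interface has nothing explicit to display; only after `submit()` does it return whatever the supplied labeler produces.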

6 Conclusion and Future Work

Understanding emotional states correctly requires gathering data from several components, including the conscious subjective feeling, which is accessible only through self-report. Our findings, derived from five design workshops that investigated how individuals comprehend emotion-related concepts, provide insight into the representation of the arousal and valence dimensions, the interface layout, and feedback. Furthermore, by converting our results into design implications, we offer considerations for developing graphical interfaces that facilitate emotion reporting. In future work, we plan to replicate the study with a gender-balanced sample drawn from multiple countries, with the aim of obtaining more robust and reliable results. Subsequently, we will develop prototypes to assess the effectiveness of the design implications.