
1 Introduction

Emotions are complex mental and physiological states that can be associated with a wide array of thoughts, feelings, and behaviors and can be experienced as positive or negative depending on the individual and the situation. Facial expressions refer to a set of specific movements of the small muscles in the face that can be used to identify or infer a person’s particular emotional state, such as happiness, anger, or sadness. Although there is ongoing debate among scientists about whether animals possess emotions, a growing body of research indicates that many species do exhibit behaviors and physiological responses indicative of various emotional states [1]. Nonetheless, it is widely recognized that human beings have evolved more complex and sophisticated mechanisms for the expression and control of emotions than other animals [2]. It is not uncommon for individuals to lose control of their emotions, particularly in situations that trigger intense emotional responses. When this occurs, the individual may display behaviors that are inconsistent with their usual personality and values and may experience negative consequences, such as damaged relationships, impaired decision-making, and reduced well-being. It is therefore important for individuals to exercise appropriate control over their emotions in order to mitigate these potential negative impacts.

Gaming is a popular form of entertainment and has the potential to evoke a wide range of emotions in players [3, 4]. However, humans can experience a variety of emotions without being fully aware of them, which can impact their ability to control their emotions in a healthy way. This can be particularly problematic for players who struggle with uncontrolled emotions, such as anger or anxiety, and can negatively affect their gaming experience. Therefore, the problem that this paper aims to address is the need for a game that can adapt to a player’s emotions and provide a more tailored, enjoyable, and beneficial experience.

As technology has advanced, there has been a growing interest in using machine learning and computer vision to enhance gaming experiences. The availability of data on people’s faces and their corresponding emotions has presented an opportunity to use this information to improve gaming experiences. By combining a game with a computer vision module, it is possible to read a player’s emotions and adjust the game’s difficulty in real-time, providing a more personalized and engaging experience.

The significance of this project is that it has the potential to help individuals develop emotional control and self-awareness, which can lead to healthier emotional responses and stronger relationships. People who struggle with uncontrolled emotions, such as those with anxiety disorders or short tempers, could benefit from playing a game that helps them practice managing their emotions. Additionally, this game provides a fun and engaging way for players to assess their emotional control and awareness. The objective of this paper is to describe the design and development of a game that uses computer vision to adapt to a player’s emotions and provide a more personalized gaming experience. By doing so, this paper hopes to contribute to the development of games that promote emotional intelligence and control.

2 Related Work

There has been a growing interest in the use of machine learning and computer vision to recognize emotions in recent years. This has led to the development of many commercial applications that use emotion recognition to improve user experience, such as chatbots, virtual assistants, and personalized advertising. However, the use of emotion recognition in gaming is still a relatively new and underexplored area of research. By creating a game that uses computer vision to recognize a player’s emotions, this project aims to contribute to this emerging field.

The authors of the article [5] used wearable biofeedback sensors to measure players’ peripheral physiological signals, such as heart rate and skin conductance, and analyzed several physiological indices to determine their correlations with anxiety. They then used these data to infer the player’s probable anxiety level and adjusted the game’s difficulty level in real time based on the inferred emotional state.

The paper [6] describes the implementation of a dynamic difficulty adjustment (DDA) method in Tetris that uses an Active Shape Model (ASM) and a Hidden Markov Model (HMM) to recognize players’ emotional states from a camera feed. The authors employed a Kalman filter to dynamically estimate the player’s experience and adjust the game speed accordingly. Experimental results showed that the DDA method improved players’ game experience.

The authors of [7] created a horror game, Caroline, that uses the player’s biometric data to adapt the difficulty level based on their stress levels. They explored the impact of this approach on players’ cognitive, emotional, performative, and decision-making challenges, as well as their intrinsic motivation, and compared it to the base game without any DDA. Their results showed that players felt more motivated when the gameplay was adjusted according to their heart rate, and that the DDA affected only the decision-making challenge [8, 9].

3 Methodology

We propose a solution built around an emotion-adaptive game that uses a real-time emotion recognition program to capture and store the player’s emotional state in a database. The game then uses this information to dynamically adjust its difficulty level as necessary. Specifically, we have chosen to develop a space shooter game to showcase this emotion-adaptive functionality, in which the player maneuvers a spacecraft laterally and fires at incoming enemies [10].

Our study explores two approaches for setting the difficulty level in an emotion-adaptive game. The first approach lowers the difficulty when the player experiences negative emotions such as sadness, anger, or fear, with the expectation that this will improve their mood. However, we found this approach to be ineffective, as it can lead to stagnation in the player’s emotional state, with the player incentivized to show negative emotions to achieve a higher score [11, 12].

The second approach involves penalizing the player for showing negative emotions by increasing the difficulty level, which challenges the player to control their emotions. We believe this approach to be more effective in helping players to regulate their emotions [13]. Our proposed game will include three different modes, each designed to elicit a specific emotional response from the player. Overall, our goal is to create an engaging and constructive gaming experience that helps players to manage their emotions in a healthy way [14].

We propose three modes for the respective emotions:

  1. Easy (A): happy and neutral.

  2. Medium (B): angry and sad.

  3. Hard (C): fear and surprise.

There are two main modules involved in this game:

  1. Emotion prediction module: The emotion predictor is designed to recognize faces in real time, crop and resize the image, convert it to grayscale, and predict the associated emotion using computer vision techniques. The prediction model is trained beforehand, and the resulting model is used to predict human emotions and store the data in a database. Real-time images are captured using OpenCV with a Haar cascade classifier, and the model is used to predict emotions accurately [15, 16] (a minimal sketch of this loop appears after this list).

  2. Game module: The game module runs in parallel with the emotion predictor using threads. It accesses the emotion data stored in the database by the predictor and adjusts the difficulty level accordingly [17]. To ensure an optimal gaming experience, the player begins at a relatively easy difficulty level that gradually increases as the game progresses.
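The following is a minimal sketch of the emotion prediction loop described in item 1. The file names ("emotion_model.h5", "emotions.db"), the label order, and the single-table schema are illustrative assumptions rather than the exact implementation.

```python
import sqlite3

import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Assumed label order; the trained CNN itself is covered in Sect. 4.
LABELS = ["angry", "fear", "happy", "neutral", "sad", "surprise"]
model = load_model("emotion_model.h5")
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

db = sqlite3.connect("emotions.db")
db.execute("CREATE TABLE IF NOT EXISTS counts (emotion TEXT PRIMARY KEY, frames INTEGER)")

cap = cv2.VideoCapture(0)                     # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))      # crop + resize
        face = face[None, :, :, None].astype("float32") / 255.0  # shape (1, 48, 48, 1)
        emotion = LABELS[int(np.argmax(model.predict(face, verbose=0)))]
        db.execute("INSERT INTO counts VALUES (?, 1) ON CONFLICT(emotion) "
                   "DO UPDATE SET frames = frames + 1", (emotion,))
    db.commit()
```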

The system can be represented as a block diagram as shown in Fig. 1, where the player inputs keypresses through a keyboard, which are then transmitted to the game. The game processes these inputs and moves the player’s character accordingly, with the output being displayed on a monitor visible to the player. Meanwhile, a camera captures the player’s facial expressions in real-time and transmits them to an emotion prediction model. The model generates emotion predictions, which are then stored in a database [18]. The game reads the database and adjusts its difficulty level based on the frequency of emotions detected.

Fig. 1 Block diagram of the proposed game: the player’s keypresses drive the game shown on the monitor, while a camera captures the player’s facial expressions; the emotion prediction model’s outputs are stored in a database of emotion frequencies that determines the game’s difficulty.

4 Implementation Details

The process of predicting emotions involves a Convolutional Neural Network (CNN) model. An appropriate architecture is selected and hyperparameters are set before training the model. Once trained, the model is evaluated on both the training and testing datasets, and the architecture and hyperparameters are fine-tuned to improve accuracy. After obtaining a high-performing model, it is saved. A system is then implemented to capture facial images of a person and send them to the saved model in real time to predict emotions.

To achieve this, we first select a dataset for training the model. For this purpose, we chose the Kaggle FER 2013 dataset, which consists of grayscale images of people’s faces labeled with seven different emotions: angry, disgust, fear, happy, neutral, sad, and surprise. The dataset contains 35,887 images with dimensions of 48 × 48 pixels.
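As a point of reference, the Kaggle release of FER 2013 ships as a CSV with "emotion", "pixels", and "Usage" columns, where each face is a space-separated string of 2304 pixel values. The sketch below shows one way to load it; the local file path is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.read_csv("fer2013.csv")        # assumed local copy of the Kaggle CSV
# Each "pixels" entry is a space-separated string of 48 * 48 = 2304 values.
X = np.stack(df["pixels"].map(
    lambda s: np.asarray(s.split(), dtype=np.uint8).reshape(48, 48)))
X = X[..., None].astype("float32") / 255.0     # (N, 48, 48, 1), scaled to [0, 1]
y = df["emotion"].to_numpy()                   # integer class labels, 0..6
train = (df["Usage"] == "Training").to_numpy() # official train/test split
X_train, y_train = X[train], y[train]
X_test, y_test = X[~train], y[~train]
```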

We utilized a CNN model implemented with the Keras API in TensorFlow. The model consisted of repeated combinations of convolutional, batch normalization, and max pooling layers with ReLU activations. Our initial training of this model yielded a prediction accuracy of only around 63%. Upon further analysis, we discovered characteristics of the dataset that limited our accuracy [19, 20]. First, the dataset contains only grayscale images with dimensions of 48 × 48 pixels. Training on grayscale images with a single channel of pixel intensity provides less detailed information than training on the more commonly available three-channel RGB images. Additionally, while larger image sizes generally provide more precise features, training on larger images becomes more difficult when there is a large amount of data to process, and predicting emotions on large images in real time can become problematic. Therefore, for this particular task, 48 × 48 pixel images were deemed sufficient [21].
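A representative Keras model of the kind described is sketched below. The paper specifies only the layer types, so the number of blocks, filter counts, and dense-layer sizes are assumptions.

```python
from tensorflow.keras import layers, models

def build_model(num_classes):
    """Repeated Conv -> BatchNorm -> MaxPool blocks with ReLU activations."""
    model = models.Sequential([
        layers.Input(shape=(48, 48, 1)),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model(num_classes=7)   # all seven FER 2013 classes at this stage
```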

We collected additional data and used data augmentation techniques to improve the dataset. Further, we reduced the class skew by removing the under-represented “disgust” emotion, which is less relevant to our difficulty modes. Improving upon all these factors gave us an accuracy of about 74%. The accuracy is still not near perfect because a significant amount of detail is lost at 48 × 48 resolution; moreover, some emotions are labeled ambiguously in the dataset, and the model’s predictions on such images were reasonable, so we judged the model suitable for use [22].
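The sketch below illustrates these two steps, reusing X_train, y_train, X_test, y_test, and build_model from the previous sketches. The augmentation ranges, batch size, and epoch count are illustrative assumptions, not the reported training configuration.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

DISGUST = 1                                    # FER 2013 label index for "disgust"

def drop_disgust(X, y):
    keep = y != DISGUST
    # Shift labels above the removed class down so they stay contiguous (0..5).
    return X[keep], np.where(y[keep] > DISGUST, y[keep] - 1, y[keep])

X_tr, y_tr = drop_disgust(X_train, y_train)
X_te, y_te = drop_disgust(X_test, y_test)

augmenter = ImageDataGenerator(rotation_range=10,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               zoom_range=0.1,
                               horizontal_flip=True)

model = build_model(num_classes=6)             # six classes after the removal
model.fit(augmenter.flow(X_tr, y_tr, batch_size=64),
          epochs=50, validation_data=(X_te, y_te))
model.save("emotion_model.h5")                 # consumed by the predictor sketch
```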

The space shooter game is implemented in Python with the Pygame module. When the main game loop starts, it also starts a parallel thread that reads emotions. The game initially starts in mode A and generates a wave of enemies for that mode while the emotion predictor records the emotion data. When the player defeats all the enemies in the current wave, a new wave of enemies is generated, and the difficulty mode is set based on the emotions displayed by the player; the mode can now be A, B, or C. While a wave of enemies is in progress, the predictor records the emotion information, and once the wave’s enemy count reaches zero, the game reads this information and creates a new wave based on the recorded emotions [23]. A structural sketch of this loop follows.
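In the skeleton below, the Pygame rendering, combat logic, and real CNN predictor are stubbed out, and the wave sizes per mode are assumptions; it shows only the threading structure and the wave-boundary mode re-evaluation.

```python
import threading
import time

ENEMIES_PER_MODE = {"A": 5, "B": 8, "C": 12}   # assumed wave sizes
counts = {}                                     # frames per emotion, current wave
lock = threading.Lock()

def choose_mode(c):
    return "A"  # placeholder; the weighted version appears after the next paragraph

def predictor():
    """Stub for the CNN predictor thread (see the sketch in Sect. 3)."""
    while True:
        with lock:
            counts["neutral"] = counts.get("neutral", 0) + 1
        time.sleep(1 / 30)                      # roughly one prediction per frame

def game_loop():
    mode = "A"                                  # the game always opens in mode A
    remaining = ENEMIES_PER_MODE[mode]
    while True:
        remaining -= 1                          # stand-in for the Pygame combat logic
        time.sleep(0.5)
        if remaining == 0:                      # wave cleared: re-evaluate the mode
            with lock:
                mode = choose_mode(dict(counts))
                counts.clear()                  # start fresh for the next wave
            remaining = ENEMIES_PER_MODE[mode]
            print("next wave in mode", mode)

threading.Thread(target=predictor, daemon=True).start()
game_loop()
```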

Since the emotions fear and surprise are infrequent, they are given more weight than the other emotions. The sad and angry emotions are given medium weight, and the neutral and happy emotions are given the least weight, as they are the emotions that occur most frequently. The emotion data are the number of frames for which the model predicted the user to be showing a particular emotion while a wave is in progress, as shown in Fig. 2.
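A hedged sketch of this weighting is given below: each emotion maps to a mode, weighted frame counts are summed per mode, and the highest-scoring mode wins. The specific weight values (3/2/1) are illustrative assumptions; the paper gives only the relative ordering.

```python
MODE_OF = {"happy": "A", "neutral": "A",
           "angry": "B", "sad": "B",
           "fear": "C", "surprise": "C"}
WEIGHTS = {"fear": 3, "surprise": 3,    # rare emotions: highest weight
           "angry": 2, "sad": 2,        # medium weight
           "neutral": 1, "happy": 1}    # frequent emotions: least weight

def choose_mode(counts):
    """counts maps an emotion to the number of frames it was predicted."""
    score = {"A": 0, "B": 0, "C": 0}
    for emotion, frames in counts.items():
        score[MODE_OF[emotion]] += WEIGHTS[emotion] * frames
    return max(score, key=score.get)

# 200 neutral frames score 200, while 80 surprise frames score 240,
# so a brief burst of surprise is enough to push the game into mode C.
print(choose_mode({"neutral": 200, "surprise": 80}))  # -> "C"
```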

Fig. 2 Machine learning training process flowchart: data, preprocessing, model selection, training, prediction with parameter tuning, evaluation with accuracy improvement, and saving of the model, which a separate program then feeds images in real time.

5 Result Analysis

The locally saved CNN model is used in conjunction with a camera module, which employs OpenCV, PIL, and a Haar cascade classifier to capture facial images. Each captured image is cropped, resized, and fed into the emotion predictor to generate real-time emotion data. The generated data are stored in an SQLite database every 30 frames. The game and the predictor operate concurrently in separate threads. The game retrieves the emotion data from the database and resets the emotion counts to 0 upon each retrieval, allowing new data to be loaded by the emotion predictor. Running the emotion predictor in parallel with the game gave us around 55 frames per second on a computer with an Intel Core i5-2400 CPU and 4 GB of DDR3 RAM. Screenshots of the game are shown in the figures below.
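The game-side read-and-reset step can be sketched as follows, using the same assumed table schema as the predictor sketch in Sect. 3.

```python
import sqlite3

def read_and_reset_counts(db_path="emotions.db"):
    """Fetch the per-emotion frame counts and zero them for the next interval."""
    db = sqlite3.connect(db_path)
    rows = db.execute("SELECT emotion, frames FROM counts").fetchall()
    db.execute("UPDATE counts SET frames = 0")   # let the predictor load new data
    db.commit()
    db.close()
    return dict(rows)
```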

Our game successfully uses the player’s real-time emotions to set the game difficulty, as shown in Figs. 3 and 4.

Fig. 3 Gameplay screenshots with the mode set to A in both images; the player shows neutral and angry emotions, with 17 and 14 lives remaining, respectively.

Fig. 4 Gameplay screenshots with the mode set to B in the first image and C in the second; the player shows a surprise emotion, with 13 and 9 lives remaining, respectively.

6 Conclusion

In this paper, we discussed the design and implementation of an Emotion-Adaptive Space Shooter Game, which incorporates real-time emotion prediction to optimize the game’s difficulty level. Our results show that the proposed system can predict the player’s emotional state and adjust the game’s difficulty level accordingly, resulting in a more engaging and challenging user experience. Moreover, our approach of penalizing negative emotions rather than accommodating them showed a noticeable improvement in emotion regulation and control. Overall, our Emotion-Adaptive Space Shooter Game provides a promising avenue for future research in the field of game design, particularly in developing more personalized and emotionally intelligent gaming experiences.

Furthermore, the potential benefits of this project extend beyond the gaming industry. Emotional control and self-awareness are important skills in many aspects of life, including relationships, work, and personal well-being. By providing a fun and engaging way for players to practice managing their emotions, this game has the potential to promote emotional intelligence and self-awareness beyond the world of gaming.

In summary, the motivation behind this project is to leverage advancements in technology to create a game that adapts to a player’s emotions and provides a more personalized and engaging experience. This has the potential to promote emotional control and self-awareness in players, while also contributing to the emerging field of emotion recognition in gaming.