
1 Introduction

Music is a source of inspiration, healing and expression of feelings. It has evolved gradually over time and plays an important role in all cultures. Nowadays, music is not only a symbol of art and culture; it is also used as a method of therapy in different areas of medicine.

Music has been shown to affect physiology and symptoms. If carefully selected, music can reduce stress and anxiety, offer distraction from pain, increase relaxation and comfort, boost energy, improve patients’ mood and enhance clinical performance. For example, music can improve mood in oncology patients, reduce anxiety and blood pressure in surgical patients, reduce depression and decrease patients’ perception of pain, while classical music increases weight gain in premature infants [1].

Music-supported therapy is used for patients who suffered a stroke because it helps them in cognitive rehabilitation [2]. In addition, music therapy can help people who have gone through traumatic events to move on more easily [3].

Listening to music is a complex process, which involves many motor, cognitive, perceptive and emotional components that work together to create our subjective experience of music [2]. Our mood and emotional state are greatly influenced by the music we listen to, and we can steer away from negative emotions by means of appropriate music.

The user’s emotions are the key to developing a music recommendation system. Recommendation systems are currently among the most popular applications in various fields, such as engineering, research, finance and sales, and in industries such as music or movie streaming. The main purpose of a recommendation system is to predict what users might be interested in [4]. The recommendation can be made according to the user’s preferences, returning a ranking of the most suitable items [5].

Recommendation systems can be classified into three main models: the collaborative filtering model, the content-based model and the hybrid model, which is a combination of the two [6]. There are also knowledge-based and demographic recommender systems [5].

This paper presents the design and implementation of a prototype music recommendation system based on the correlations between the user’s emotional states. The system uses tags or keywords assigned to songs and the correlations between them. An approach similar to collaborative filtering is suitable for a recommendation algorithm that uses correlations between emotions, because this model allows making recommendations for new users. This kind of recommendation is based on similarities between users and between recommended items. The implementation uses the Python programming language and the Google Colab environment.

The remainder of the paper is organized as follows: Sect. 2 describes the implementation of the prototype, Sect. 3 shows user scenarios and preliminary results, while Sect. 4 concludes the paper and sets future work objectives.

2 Work Description

2.1 Tools and Objectives

The prototype music recommendation system was developed in the Python programming language, within the Google Colab environment.

The objective is to develop a system that recommends music based on the correlations of the nine emotions present in the data set. The database is created through annotations collected using the Geneva Emotional Music Scale (GEMS). This scale contains nine emotion tags: amazement, solemnity, tenderness, nostalgia, calmness, power, joyful activation, tension and sadness. Each of the 400 songs in the data set is labeled according to these emotions.

The system computes the similarities between emotions, interpreted as tags, to make recommendations. The first goal of the prototype is to recommend emotions similar to the one introduced by the user. In the second phase, music recommendations are made based on the similarities between these emotions.

2.2 Implementation

The input data is taken from the data set and processed in the first phase, within the correlation algorithm.

The correlation algorithm receives the emotions from the data set as input, which are interpreted as tags, and computes the correlations.

These correlations are inputs to the emotion recommendation algorithm. This algorithm receives an emotion as input and generates two other similar emotions as a result. The obtained results are fed into the final algorithm, which recommends songs based on the similarities between emotions, given by the computed correlations (Fig. 1).

Fig. 1.
figure 1

Conceptual diagram of the prototype music recommendation system

The data set in [8] was used to implement the recommendation algorithm. It contains 400 one-minute song excerpts, divided into four musical genres (rock, pop, classical and electronic). Sorting or recommending songs by genre was not a priority for the proposed goal; the developed algorithm focuses on making recommendations based on emotions.

During the implementation of the system, emphasis was placed on the emotions and the evaluations of the songs, not on the songs’ popularity, their level of appreciation or the participants’ demographic information.

The data was divided into two subsets. The subset used to develop the system contained the nine emotion tags, the musical genre, and the id and name of each song (the names were added later). The file with the 400 evaluated songs was also attached to the data. This file of songs is the second reason why this dataset was chosen: once songs have been recommended according to the user’s mood, the user can actually listen to them.

In the dataset, the songs are evaluated with a unary rating system. If a participant felt a certain emotion while listening to a song, that emotion was assigned the value 1; if an emotion was not felt, the participant made no specification. Unspecified fields were therefore initialized with the value 0, to allow data processing and analysis. Taking this into account, a binary matrix is obtained.
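The filling step above can be sketched in plain Python. The annotation records below are illustrative values, not actual rows of the dataset in [8]:

```python
# The nine GEMS emotion tags used to label each song.
EMOTIONS = ["amazement", "solemnity", "tenderness", "nostalgia", "calmness",
            "power", "joyful activation", "tension", "sadness"]

# Hypothetical raw unary annotations: each participant only lists the
# emotions actually felt, so absent emotions are simply missing.
raw_ratings = [
    {"song_id": 1, "calmness": 1, "tenderness": 1},
    {"song_id": 1, "nostalgia": 1},
    {"song_id": 2, "power": 1, "joyful activation": 1},
]

# Initialize every unspecified emotion with 0, producing binary rows.
binary_matrix = [
    {"song_id": r["song_id"], **{e: r.get(e, 0) for e in EMOTIONS}}
    for r in raw_ratings
]
print(binary_matrix[0]["tenderness"])  # 1 (labeled)
print(binary_matrix[1]["calmness"])    # 0 (not specified, filled in)
```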

The annotations were performed by 64 participants, who listened to the songs. Each of them could select a maximum of three tags (emotions experienced when listening to the song).

The purpose of the first algorithm is to make recommendations based on the emotions contained in the data set. First, a correlation was computed between tags, that is, an association between similar tags. The correlation between two tags returns a value between 0 and 1, which represents the probability of finding the two labels in the same category or list. This expresses the similarity, or semantic correlation, between the emotions in the dataset. This type of correlation is necessary because the labels are categorical variables, as opposed to continuous ones. The correlation between two labels is computed as a probability:

$$p = \frac{\text{No. of favorable cases}}{\text{No. of possible cases}}$$
(1)

where the favorable cases are those in which variables a and b are labeled together, and the possible cases additionally count those where just a or just b is labeled. There are three possible cases: (1) both variables are labeled for a song (a_and_b); (2) only a is labeled (a_not_b); (3) only b is labeled (b_not_a). Variables a and b are initialized with the rows of the dataset where a or b equals 1 (meaning that the emotion was labeled). The dataset is parsed three times, to count all three possible cases. Finally, the numbers of favorable and possible cases are obtained and the correlation between a and b is computed using Eq. (1).
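The three counts can be sketched in a few lines of Python. The binary columns below are illustrative toy values (each position corresponds to one row of the dataset); loading the real data is omitted:

```python
# Correlation between two tags (Eq. 1): favorable cases over possible cases.
def tag_correlation(a, b):
    a_and_b = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # both labeled
    a_not_b = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # only a labeled
    b_not_a = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # only b labeled
    possible = a_and_b + a_not_b + b_not_a
    return a_and_b / possible if possible else 0.0

calmness   = [1, 1, 0, 1, 0, 0]
tenderness = [1, 0, 0, 1, 0, 1]
print(tag_correlation(calmness, tenderness))  # 2 favorable / 4 possible = 0.5
```

Note that this ratio coincides with the Jaccard index of the two binary columns, which is a standard similarity measure for categorical labels.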

With these values, the correlation matrix was formed (Fig. 2). The matrix shows the maximum correlation of 1 on the main diagonal and is symmetric with respect to it.
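Assembling the matrix can be sketched as follows, with illustrative three-emotion toy data (the real matrix computed from the dataset is 9 × 9):

```python
# Pairwise correlation of two binary tag columns (Eq. 1).
def correlation(a, b):
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)    # favorable cases
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)   # possible cases
    return both / either if either else 0.0

# Toy binary columns, one per emotion; each position is one dataset row.
columns = {
    "calmness":   [1, 1, 0, 1],
    "tenderness": [1, 0, 0, 1],
    "tension":    [0, 0, 1, 0],
}
names = list(columns)
matrix = [[correlation(columns[r], columns[c]) for c in names] for r in names]

for name, row in zip(names, matrix):
    print(name, [round(v, 2) for v in row])
```

As in Fig. 2, each diagonal entry is 1 (a tag is perfectly correlated with itself) and the matrix is symmetric, since Eq. (1) does not depend on the order of the two tags.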

The algorithm explained above was integrated into another algorithm that recommends music based on the correlations between the emotions in the data set. This approach was chosen because a given song can be defined by several tags, that is, it can convey several emotions. This would be even more relevant for a data set with more tags; however, the associations are also suitable for the data set used, and they highlight the functionality of the algorithm.

Fig. 2.
figure 2

The correlation matrix between the emotions from the data set

3 User Scenario and Preliminary Results

The operating principle of the correlation system is illustrated in Fig. 3: when the user chooses an emotion, the system recommends the two emotions with the highest correlation to the input emotion. This suggests that the three emotions can be placed in the same category and that they frequently occur together as tags for the same songs in the data set.

Fig. 3.
figure 3

Principle of operation for the correlation system

For example, if the user chooses to give as input the emotion “calmness”, the returned emotions will be “tenderness” and “nostalgia”. The semantic bond between these three emotions is clearly visible. This type of recommendation can help the user define and better understand his mood, and then choose a song according to his current emotional state or a song compatible with a new emotional state he wants to experience.

In another possible user scenario, the input emotion is “amazement” and the received emotions are “joyful activation” and “power”. These three emotions are also connected and can be placed in the same category.

Next, the music recommendation algorithm receives as input the output of the algorithm that generates correlations.

The user can choose an emotion from the list of emotions in the data set, and the two emotions with the greatest correlations to it are returned. After receiving these recommendations of similar emotions, the user is given song recommendations based on the first emotion returned. The recommendations are made this way because the emotion given by the user and the first recommended emotion are similar, as they are often tagged together in the dataset. In the end, the user can listen to the chosen song.
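The two-stage flow can be sketched as below. The correlation values and song tags are illustrative toy data, not the actual matrix or songs from the dataset in [8]:

```python
# Toy correlation rows (one per selectable emotion) and toy song tags.
CORR = {
    "calmness":   {"calmness": 1.0, "tenderness": 0.6, "nostalgia": 0.5, "tension": 0.1},
    "tenderness": {"calmness": 0.6, "tenderness": 1.0, "nostalgia": 0.4, "tension": 0.1},
}
SONG_TAGS = {
    "Song A": {"tenderness", "nostalgia"},
    "Song B": {"tension"},
    "Song C": {"tenderness"},
}

def recommend_emotions(emotion, k=2):
    """Return the k emotions most correlated with the input emotion."""
    row = CORR[emotion]
    others = sorted((e for e in row if e != emotion), key=row.get, reverse=True)
    return others[:k]

def recommend_songs(emotion):
    """Recommend songs tagged with the first (most correlated) emotion."""
    first = recommend_emotions(emotion)[0]
    return [song for song, tags in SONG_TAGS.items() if first in tags]

print(recommend_emotions("calmness"))  # ['tenderness', 'nostalgia']
print(recommend_songs("calmness"))     # songs tagged with 'tenderness'
```

Recommending from the first returned emotion, as the text describes, keeps the suggested songs closest to the user's stated mood while still broadening the choice beyond the exact input tag.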

The principle of operation of the music recommendation system is visible in the usage scenario illustrated in Fig. 4.

Fig. 4.
figure 4

Usage scenario for the music recommendation system

4 Conclusions

Emotions govern our daily lives, whether we want to accept this or not. The emotional state we are in influences the quality of our work and relationships, as well as our mental and physical state. One of the handiest tools for quickly changing and improving our emotional state is music.

The proposed system can make music recommendations based on the correlations of the emotional states in the data set. This approach allows users to better define and understand their mood and to listen to music from a certain category of songs. Users can also choose the emotion they want to experience and listen to music matching the desired mood.

This system is a prototype and was implemented at small scale, but large-scale optimization can be achieved, provided access to larger data sets. Also, testing the system and collecting user feedback are crucial for further development. The long-term goal is to include the system in a mobile app designed for mood improvement, with music and visual effects as the main instruments of the mood-changing process. A wider range of emotions and tags would significantly increase the precision of the system.