1 Introduction

Automatic text analysis to detect the presence of subjective meanings, their polarity (positive, negative or neutral), the associated emotions (joy, anger, fear, etc.) and their intensity has been extensively investigated in the last decade. Known as sentiment analysis or opinion mining, this field is of great interest for real applications such as managing customer relations (Homburg et al. 2015) or predicting election results (Lewis-Beck and Dassonneville 2015). Dedicated APIs and applications have even been integrated into well-known systems: for instance, the Google Prediction API includes a sentiment analysis module that can be used to build sentiment analysis models. The applied methods usually depend on the nature of the texts: tweets (Velcin et al. 2014), mails (Pestian et al. 2012), news headlines (Rao et al. 2013), etc., and obviously on the application domain: politics (Anjaria and Guddeti 2014), environment (Hamon et al. 2015), health (Melzi et al. 2014), etc. They are often based on techniques from statistics, Natural Language Processing and Machine Learning (ML). Supervised ML algorithms are frequently used to train text classifiers on tagged data sets, and their efficiency depends on the quality and size of the training data. However, it has been shown that the use of adapted sentiment lexicons can significantly improve the classification performance of bag-of-words classifiers (Hamdan et al. 2015). Indeed, recent studies suggest including the words conveying each sentiment as descriptive features when learning text classification models (Mohammad et al. 2015).

Sentiment lexicons organize lists of words, phrases or idioms into predefined classes (polarities, emotions, etc.) (Devitt and Ahmad 2013; Turney 2002). For example, in NRC-EmoLex (Mohammad and Turney 2013), the starting point of this study, terms like happy and heal are labeled as positive, while terms like abandon and hearse are labeled as negative. Whereas each term has only one polarity, a term may convey several emotions depending on the emotional typology used. For example, in NRC-EmoLex, the word happy is associated with the emotions joy and trust, while the word hearse is associated with sadness and fear. Many emotion typologies exist in the literature (Ekman 1992; Francisco and Gervás 2006; Pearl and Steyvers 2010; Plutchik 1980). The most famous, and at the same time the simplest, is the one proposed by Ekman, consisting of six basic emotions: joy, surprise, anger, fear, sadness and disgust. It has been used in many emotion classification studies (Mohammad and Kiritchenko 2015; Roberts et al. 2012; Strapparava and Valitutti 2004).

To date, most existing affect lexicons have been created for English and for polarity. In this paper, we describe the elaboration of a new French lexicon containing more than 14,000 terms labeled with their polarities (positive and negative) and the emotions they express (we consider the Ekman basic emotions). The applied method is based on the automatic translation and synonym expansion of NRC-EmoLex, a publicly available emotion lexicon which has proven its performance in several sentiment and emotion classification tasks (Kiritchenko et al. 2014; Mohammad 2012; Rosenthal et al. 2015). The translations have been obtained automatically by querying six online translators. An experienced human translator has validated the obtained entries as well as the associated emotions. She accepted more than 94 % of the automatically pre-validated entries (those found by at least three online translators) and less than 18 % of the remaining entries (those found by fewer than three online translators). Therefore, we believe that the proposed approach can be used to build high quality resources at low cost. Finally, in order to evaluate its quality, experiments for classification tasks (polarity and emotion) have been conducted on well-known French benchmarks. Results show that we obtain scores comparable to existing lexicons for polarity classification. More interestingly, clearly better results are obtained with FEEL for emotion classification when considering the available Ekman basic emotional classes. This highlights that our resource is well adapted to both polarity and emotion classification. It can be accessed and downloaded publicly on the internet (Abdaoui et al. 2014).

The rest of the paper is organized as follows. Section 2 reviews existing sentiment and emotion lexicons for both English and French. Section 3 describes our approach for automatically building a French lexicon as well as the manual validations. Section 4 compares FEEL with other existing French lexicons and reports their results in emotion and polarity classification tasks. Finally, Sect. 5 concludes and presents our main prospects.

2 Related work

Sentiment lexicons can be constructed using three main approaches (Pang and Lee 2008). First, they can be compiled manually by assigning the correct polarity or emotion conveyed by each word. Crowdsourcing tools and serious games are often used to collect a large number of human annotations: Mohammad and Turney (2013) used the Amazon Mechanical Turk service, while Lafourcade et al. (2015a, b) designed an online Game With a Purpose (Like it!). Second, they can be compiled automatically using dictionaries. This approach starts from a small set of seed terms for which the conveyed sentiments are known, then grows the seed set by searching for synonyms and antonyms in dictionaries (Strapparava and Valitutti 2004). Finally, the third approach constructs sentiment lexicons automatically from corpora in two possible ways. On the one hand, it can use annotated corpora of text documents and extract words that are frequent in a specific sentiment class and not in the other classes (Kiritchenko et al. 2014). On the other hand, it can use non-annotated corpora along with a small list of seed words in order to discover new ones based on their collocations (Harb et al. 2008) or using specifically designed rules (Neviarouskaya et al. 2011). However, each of these approaches has its own limitations: the manual approach is labor intensive and time consuming, while the automatic ones are error prone. In our case, we combine an automatic dictionary-based approach with manual human annotation and supervision. Regarding the sentiment and emotion typology, we have chosen the one proposed by Ekman (1992), consisting of two polarities (positive and negative) and six basic emotion classes (joy, surprise, sadness, fear, anger, disgust).

Few French resources have been proposed, especially ones dealing with emotions. Table 1 presents four French sentiment lexicons that we found in the literature. While all of them provide the sentiment polarity, only two provide the exact emotional category: the Affects lexicon (Augustyn et al. 2006), which contains only around 1200 terms associated with more than 45 hierarchical emotions, and Diko (Lafourcade et al. 2015b), which contains about 450,000 non-lemmatized expressions associated with almost 1200 emotion terms (many of which are synonyms). The two remaining lexicons, CASOAR (Asher et al. 2008) and Polarimots (Gala and Brun 2012), consider only the polarity and not the emotion. Furthermore, CASOAR is not publicly available, making the number of truly exploitable French sentiment resources equal to three.

Table 1 Existing French resources for sentiment polarity and emotion

More sentiment resources have been compiled for English terms. Table 2 shows seven English lexicons that we found in the literature. All of them consider the sentiment polarity, but only five offer the exact emotional category. As we want to build a sentiment lexicon that considers both emotion and polarity, we restrict our choice to these five lexicons. The most extensive ones are NRC-EmoLex (Mohammad and Turney 2013) and the NRC Hashtag Emotion lexicon (Mohammad and Kiritchenko 2015). These lexicons have proven their performance in several sentiment and emotion classification tasks (Kiritchenko et al. 2014; Mohammad 2012; Rosenthal et al. 2015). Indeed, their authors obtained remarkable results in the SemEval 2013 (Nakov et al. 2013) and SemEval 2014 (Rosenthal et al. 2014) evaluation campaigns. Furthermore, NRC-EmoLex has been built on the General Inquirer (Stone et al. 1966) and WordNet Affect (Strapparava and Valitutti 2004) lexicons. Concretely, it corrects their terms and adds new unigrams and bigrams using the wisdom of the crowd. For all these reasons, we decided to start from this resource in order to build a new comprehensive emotion resource for French.

Table 2 Existing English resources for sentiment polarity and emotion

3 Methods

In this section, we present the methods used for the automatic creation of FEEL. Then, we describe the manual validation performed by a professional human translator. Finally, we evaluate the sentiments associated with a subset of terms, as assessed by three different human annotators.

3.1 Automatic creation

After manually correcting some inconsistencies in NRC-EmoLex (words associated with all emotions and words associated with contradictory polarities), our aim was to automatically translate all of its English terms (14,182 terms) into French. Automatic translation methods can be based on three types of resources: (1) aligned resources (Och and Ney 2004); (2) comparable corpora (Sadat et al. 2003) and (3) multilingual encyclopedias (Erdmann et al. 2009). Since we have neither aligned resources nor comparable corpora in which we could find all the entries of the initial lexicon, we chose a different approach and used the wealth of automatic translators available online. For each entry of NRC-EmoLex, we automatically queried six online translators: Google Translate, Bing Translate, Collins Translator, Reverso Dictionary, Bab.la and Word Reference. Each English term may generate several French translations. The entries obtained from at least three translators have been considered pre-validated.
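
The pre-validation rule can be pictured with the following minimal sketch, a hypothetical illustration in Python rather than the actual implementation (the function name and the example translator outputs are ours): a candidate French translation is pre-validated when at least three of the six services return it.

```python
from collections import Counter

# Illustrative sketch of the pre-validation rule (not the authors' code):
# a French translation is pre-validated when at least three of the six
# online translators return it for a given English entry.
PRE_VALIDATION_THRESHOLD = 3

def pre_validate(translations_per_service):
    """translations_per_service: one set of French translations per translator."""
    counts = Counter()
    for service_output in translations_per_service:
        counts.update(set(service_output))  # count each service at most once
    pre_validated = {t for t, n in counts.items() if n >= PRE_VALIDATION_THRESHOLD}
    return pre_validated, set(counts) - pre_validated

# Made-up outputs for the English entry "happy":
outputs = [{"heureux"}, {"heureux", "content"}, {"heureux"},
           {"content"}, {"joyeux"}, {"heureux"}]
print(pre_validate(outputs))  # ({'heureux'}, {'content', 'joyeux'})
```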

In order to expand our resource, we decided to include English and French synonyms. Synonymy corresponds to a similarity in meaning between words or phrases of the same language; therefore, synonyms should share the same emotion and polarity class. Antonyms have not been considered since our emotion model does not support contrary emotions. In the literature, synonymy has been used to build sentiment resources by expanding seed words for which the polarity or the emotional class is already known (Strapparava and Valitutti 2004). Here, we adopted a similar approach to expand both the English entries and the French translations. For all English entries of the original resource, we searched for synonyms using eight online websites: Reverso Dictionary, Bab.la, Atlas, Thesaurus, Ortolang, SensAgent, The Free Dictionary and the Synonym website. The obtained English synonyms have been translated as described above. Similarly, for all French entries, we searched for synonyms using two online websites: Ortolang and Synonymo. Entries associated with contradictory polarities have been automatically removed. Finally, the automatically compiled resource contained 141,428 French entries (56,599 pre-validated entries and 84,829 non pre-validated entries).
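
The expansion step can be sketched as follows, under the assumption that each synonym inherits the polarity and emotions of its seed term; `get_synonyms` is a hypothetical stand-in for the queries to the online synonym dictionaries.

```python
# Illustrative sketch (not the authors' code) of the synonym expansion:
# synonyms inherit the seed's polarity and emotions, and entries that end up
# with contradictory polarities are dropped.

def expand_with_synonyms(lexicon, get_synonyms):
    """lexicon: dict term -> {'polarity': 'positive'|'negative', 'emotions': set}.
    get_synonyms: callable returning an iterable of synonyms for a term."""
    expanded = {t: {"polarities": {v["polarity"]}, "emotions": set(v["emotions"])}
                for t, v in lexicon.items()}
    for term, labels in lexicon.items():
        for syn in get_synonyms(term):
            entry = expanded.setdefault(syn, {"polarities": set(), "emotions": set()})
            entry["polarities"].add(labels["polarity"])
            entry["emotions"] |= labels["emotions"]
    # Drop entries that inherited both polarities from different seed terms.
    return {t: v for t, v in expanded.items()
            if not {"positive", "negative"} <= v["polarities"]}
```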

3.2 Validating the translations

In order to obtain a high quality resource and to evaluate the quality of the automatic process, we hired a professional human translator. All the automatically obtained entries have been presented to her via a web interface. For each English term, she could validate or reject the automatically obtained translations, manually add a new translation and change the associated polarities and emotions. Example sentences using the current term, generated from the Linguee website, were displayed to help her grasp its meaning. Our professional translator worked full-time for two months. She validated less than 18 % of the entries that had been obtained from fewer than three translators (15,091 terms), against more than 94 % of those found by at least three online translators (53,277 terms). This result shows that it is possible to use online translators to inexpensively compile good quality resources. In addition to the entries validated from the automatic translations, our human translator manually added 10,431 new French translations based on the displayed English terms. Finally, our resource contained 81,757 French entries (lemmas and inflected forms), which have been lemmatized using the TreeTagger tool (Schmid 1994). This process generated 14,127 distinct lemmatized terms, consisting of 11,979 single words and 2148 compound terms. The lemmatized terms have been associated with all the emotions of their inflected forms. Terms associated with contradictory polarities have been removed (81 terms); we considered that these terms do not convey sentiment on their own and may be positive or negative according to their context. For example, the word “to vote” may be used either in a positive context (“to vote for”) or in a negative one (“to vote against”). Table 3 shows the division of the final lemmatized terms between the two considered polarities and the six basic emotions, and the intersections between them. It appears that most positive entries are associated with the emotion joy, although some positive entries are associated with the emotions surprise, fear, sadness, anger and disgust. For example, the human translator validated the word plonger (dive) as positive but associated with the emotion fear. Conversely, most negative entries are associated with the emotions surprise, fear, sadness, anger and disgust, and very few negative entries are associated with the emotion joy. For example, the word capiteux (heady) is negative but has been associated with the emotion joy. We decided not to consider these associations as inconsistent since our human translator validated them. Similarly, emotions may share terms, especially negative ones: for example, the word accuser (accuse) is associated with the emotions anger and disgust. Finally, joy is the purest emotion since it does not share any entry with the remaining Ekman basic emotions.

Table 3 The intersections between the polarities and emotions in FEEL

3.3 Evaluating the sentiments

While the professional manual translations can be considered reliable, the associated sentiments and emotions may be subjective (only one annotator). In order to evaluate the quality of our resource, the sentiments and emotions associated with a subset of FEEL terms have been evaluated manually by three new annotators. To compile this subset, we selected terms that are frequent in four French benchmarks. These benchmarks will be used later to test whether FEEL can improve sentiment and emotion classification. Three of them have been produced for the third edition of the French Text Mining challenge (DEFT’07), where the task was the classification of text documents from various sources according to their polarity. The fourth benchmark has been produced for the 11th edition of the same challenge (DEFT’15), where the task was the classification of tweets according to their polarity, subjectivity and expressed emotions. Table 4 presents the nature and the subject of each benchmark and the considered classification task(s). While all the benchmarks consider the polarity of French texts, only the fourth one provides the exact emotional class.

Table 4 Details about the used benchmarks

Terms that appear at least 10 times in the training set and at least 10 times in the testing set of each benchmark have been selected. Figure 1 shows the frequency of FEEL terms in the training set of the Climate benchmark (on a log10 scale). The horizontal line (y = 1) corresponds to our frequency threshold (log10(10) = 1). Finally, 120 terms have been selected, which represent less than 1 % of FEEL terms but account for almost a third of the occurrences of FEEL terms in the presented benchmarks. Regarding their division between the two polarities, 109 terms were initially assigned to the positive polarity against 11 terms associated with the negative one. On the other hand, each emotion of the Ekman typology has only seven terms, except the emotion “anger”, which has four. Most of the terms are not associated with any emotion.

Fig. 1 The distribution (on a log10 scale) of FEEL terms in the training set of the Climate benchmark
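
A minimal sketch of this selection step is given below, assuming tokenised (lemmatized) benchmark documents and a list of lexicon terms; all names are placeholders.

```python
from collections import Counter

# Illustrative selection of the evaluation subset: keep lexicon terms that
# occur at least 10 times in both the training and the testing documents.
MIN_FREQ = 10

def frequent_terms(lexicon_terms, train_docs, test_docs):
    """train_docs / test_docs: iterables of tokenised (lemmatized) documents."""
    train_counts = Counter(tok for doc in train_docs for tok in doc)
    test_counts = Counter(tok for doc in test_docs for tok in doc)
    return [t for t in lexicon_terms
            if train_counts[t] >= MIN_FREQ and test_counts[t] >= MIN_FREQ]
```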

These terms have been presented to three new annotators in order to check the associated polarities and emotions. In order to handle polysemy, two types of annotation have been performed:

  • Annotation without context: the annotators are asked to choose the associated polarities and emotions without presenting any example to them.

  • Annotation in context: the annotators are asked to choose the associated polarities and emotions according to the term's sense in a displayed sentence. Four contexts have been considered, corresponding to the four benchmarks. From each benchmark, we selected the first sentence containing the corresponding term and presented it as an example to the annotators.

Table 5 presents the agreement between the three annotators for each annotation type. First, Fleiss' kappa shows good agreement on polarity but poor agreement on emotions in both annotation types. These results are similar to those obtained by Mohammad and Turney (2013) when building the original English NRC-EmoLex. However, Fleiss' kappa does not take into account the number of items per category. Since our categories are very unbalanced (many more terms associated with the category “no” than with the category “yes” for a given emotion), we also report the percentage of terms for which the three annotators chose the same category. Indeed, our three annotators agreed on most of the terms (more than 85 % for each task and annotation type). Finally, our annotators suggested including a “neutral” polarity in our future work.

Table 5 Annotators agreement for polarity and emotions (arithmetic mean) in each annotation type
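
For reference, a minimal sketch of the two agreement measures reported in Table 5 follows; the rating matrix is illustrative toy data, not the actual annotations.

```python
# Fleiss' kappa and exact-agreement percentage for three annotators on a
# binary "yes"/"no" decision per term (illustrative sketch, toy data).

def fleiss_kappa(ratings, categories=("yes", "no")):
    n = len(ratings[0])                      # annotators per item
    N = len(ratings)                         # number of items
    counts = [[row.count(c) for c in categories] for row in ratings]
    p_bar = sum(sum(c * c for c in row) - n for row in counts) / (N * n * (n - 1))
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

def full_agreement(ratings):
    """Share of items on which all annotators chose the same category."""
    return sum(len(set(row)) == 1 for row in ratings) / len(ratings)

ratings = [["yes", "yes", "yes"], ["no", "no", "no"], ["yes", "no", "no"]]
print(round(fleiss_kappa(ratings), 2), round(full_agreement(ratings), 2))
```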

Finally, the annotations without context have been used to evaluate the initial sentiments and emotions. A majority vote has been used to extract the reference annotations. Table 6 presents the micro-averaged precisions, recalls and F1-measures for polarity and emotions. Micro averaging is used to deal with unbalanced data sets; in our case, we used label-frequency-based micro-averaging (Van Asch 2012), which weights each class's results by its proportion of items in the test set. The emotion evaluation metrics have been averaged (arithmetic mean) over the six emotions. The presented results show very high consistency between the initial sentiments and those selected by at least two of the new annotators (majority vote).

Table 6 Evaluating the sentiments of the chosen subset of terms
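
The label-frequency-based micro-averaging can be sketched as follows; this is an illustrative reimplementation of the averaging scheme described above, with hypothetical variable names.

```python
# Per-class precision/recall/F1 weighted by each class's share of the
# reference items (label-frequency-based micro-averaging, Van Asch 2012).

def weighted_scores(gold, predicted, classes):
    n = len(gold)
    prec = rec = f1 = 0.0
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, predicted))
        fp = sum(g != c and p == c for g, p in zip(gold, predicted))
        fn = sum(g == c and p != c for g, p in zip(gold, predicted))
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = (tp + fn) / n               # class proportion in the reference
        prec, rec, f1 = prec + weight * p_c, rec + weight * r_c, f1 + weight * f_c
    return prec, rec, f1
```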

4 Evaluations

In this section, we compare FEEL with existing French resources using various French benchmarks for polarity and emotion classifications.

4.1 Lexicons

Here, we present the lexicons used in our evaluations. Among the four French lexicons listed in Sect. 2, only CASOAR has not been included since it is not publicly available. The remaining three French lexicons have been downloaded and used in our evaluations. All of them contain lemmatized terms except Diko, whose expressions have been cleaned and grouped into lemmatized terms. Figure 2 presents the percentage of terms in each lexicon according to their number of words. It appears that almost all Affects and Polarimots terms are composed of a single word (100 % for Polarimots and over 99 % for Affects). More than 85 % of FEEL terms are single words and almost 15 % are compound terms (9 % composed of two words and 5 % of three words). Finally, only 33 % of Diko terms are single words. The rest are divided as follows: 31 % are composed of two words, 22 % of three words, 8 % of four words, 3 % of five words and the remaining 3 % of more than five words.

Fig. 2 The percentage of terms in each lexicon according to their length (number of words)

Table 7 presents the number of terms in each lexicon and the number of common terms between each pair of lexicons. Diko is the largest resource with 382,817 lemmatized French entries. FEEL is the second largest with 14,127 terms. Polarimots and the Affects lexicon contain 7483 and 1348 terms respectively. Diko covers almost 97 % of FEEL terms (13,681 out of 14,127), almost 88 % of Affects terms (1182 out of 1348) and more than 98 % of Polarimots terms (7359 out of 7483). Diko is therefore clearly the most extensive resource, but we have no information about the proportion of noisy (non-affective) terms that it may contain.

Table 7 The intersections between the terms of each pair of lexicons

Table 8 shows the number of positive, negative and neutral terms in each lexicon. FEEL is the only lexicon that does not consider the neutral polarity. We notice that all lexicons except Diko have more negative terms than positive ones. The algorithm used for selecting the candidate terms may explain this observation (Lafourcade et al. 2015c).

Table 8 The number of positive, negative and neutral terms in each lexicon

Regarding the agreement between each pair of lexicons on the associated polarities, Table 9 presents the percentage of common terms having the same polarity. Neutral terms have not been considered in these calculations. Table 9 shows that, for every pair of lexicons, more than 80 % of their common positive and negative terms are associated with the same polarity. The highest agreement is observed between Diko and Polarimots, with 91 % of common terms associated with the same polarity.

Table 9 Percentage of common terms between each pair of lexicons having the same polarity
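
The pairwise comparison behind Table 9 boils down to a set intersection; a small illustrative sketch, with lexicons represented as dictionaries mapping terms to polarity labels:

```python
# Share of common, non-neutral terms that two lexicons label with the same
# polarity (illustrative sketch; `lex_a` and `lex_b` map term -> polarity).

def polarity_agreement(lex_a, lex_b):
    common = [t for t in lex_a.keys() & lex_b.keys()
              if lex_a[t] != "neutral" and lex_b[t] != "neutral"]
    return sum(lex_a[t] == lex_b[t] for t in common) / len(common) if common else 0.0
```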

Finally, all the lexicons used here consider the polarity of French terms, but only three give the exact emotion class (Polarimots does not consider emotions). Each of the remaining lexicons follows its own emotional typology (FEEL: 6 emotions, Affects lexicon: 45 emotions, Diko: more than 1200 emotion terms).

4.2 Evaluation benchmarks

Table 10 presents the division of positive and negative text documents for training and testing in each benchmark. It shows that the benchmark Political Debate contains the largest number of documents. It also shows that there is an acceptable number of documents for training and for testing in each benchmark.

Table 10 The division of training and testing documents for polarity in each benchmark

Regarding the distribution of text documents across the emotion classes, the only benchmark considered is Climate. This benchmark distinguishes 18 emotion classes, which are presented in Fig. 3. For better visualization, the number of tweets is shown on a logarithmic scale (base 10). Only four of the six Ekman basic emotion classes are present in this emotional typology. Figure 4 shows the division of tweets between these four emotions for the training and testing sets (positive surprise and negative surprise have been grouped into one class). In both figures, it appears that the emotion classes are very unbalanced. For example, only 6 tweets are associated with Boredom, while 2148 tweets are labeled with Valorization. The complete table presenting the division of Climate training and testing tweets between the 18 original emotions is given in the appendices.

Fig. 3 The division of Climate training and testing tweets between the original 18 emotion classes (logarithmic scale)

Fig. 4 The division of Climate training and testing tweets between the available Ekman basic emotions

4.3 Evaluation in a polarity classification task

Our aim is to evaluate the classification gain obtained when adding features extracted from the different lexicons to bag-of-words classifiers. First, Support Vector Machines (SVM) have been trained on each data set with the Sequential Minimal Optimization method (Platt 1999). The Weka data-mining tool (Hall et al. 2009) has been used to train these classifiers with default settings on lemmatized and lowercased text documents. A feature selection step has been performed using the Information Gain filter (words having positive Information Gain have been selected). In our experiments, we call this configuration Bag_Of_Words. Then, we added two features from each lexicon to this configuration: the number of positive words and the number of negative words according to that lexicon. These two features have been added before applying the Information Gain filter. Six other configurations have thus been evaluated for each data set, corresponding to the four tested lexicons and two additional FEEL variants: FEEL with the 120 re-annotated terms replaced according to the annotation without context (FEEL_WiCxt) and according to the annotation in context (FEEL_InCxt). The macro (arithmetic mean) and micro (weighted mean) precisions, recalls and F1-measures of these configurations applied to each corpus are presented in Tables 11, 12, 13 and 14.

Table 11 Polarity classification results on the See and Read data set
Table 12 Polarity classification results on the Political Debate data set
Table 13 Polarity classification results on the Videos Games data set
Table 14 Polarity classification results on the Climate data set
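
As a rough illustration of this experimental setup, the sketch below uses scikit-learn as a stand-in for the Weka pipeline described above (the authors used Weka; LinearSVC and mutual information replace SMO and the Information Gain filter here, and all names and inputs are ours):

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import LinearSVC

def lexicon_features(docs, lexicon):
    """Two features per document: counts of positive and negative lexicon terms."""
    rows = [[sum(lexicon.get(t) == "positive" for t in d.split()),
             sum(lexicon.get(t) == "negative" for t in d.split())] for d in docs]
    return csr_matrix(np.array(rows, dtype=float))

def train_and_eval(train_docs, y_train, test_docs, y_test, lexicon=None):
    vec = CountVectorizer(lowercase=True)          # bag of (lemmatized) words
    X_train, X_test = vec.fit_transform(train_docs), vec.transform(test_docs)
    if lexicon is not None:                        # Bag_Of_Words + 2 lexicon features
        X_train = hstack([X_train, lexicon_features(train_docs, lexicon)]).tocsr()
        X_test = hstack([X_test, lexicon_features(test_docs, lexicon)]).tocsr()
    selector = SelectKBest(mutual_info_classif, k=min(300, X_train.shape[1]))
    X_train = selector.fit_transform(X_train, y_train)
    X_test = selector.transform(X_test)
    clf = LinearSVC().fit(X_train, y_train)
    return clf.score(X_test, y_test)               # accuracy on the test set
```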

The Bag_Of_Words configuration with lemmatization, lowercasing and especially feature subset selection is a highly efficient baseline: it obtained high micro and macro precisions, recalls and F-measures on all benchmarks. Moreover, the Information Gain filter selected between 63 and 390 lemmatized words depending on the benchmark, so it is difficult to observe a significant gain by adding only two new features. Still, the performance gain is noticeable on all benchmarks: almost all the lexicons induce a gain that varies from 0.1 to 7.1 % in the considered evaluation metrics. While the use of lexicons yields only a small gain on the first three benchmarks (See and Read, Political Debate and Videos Games), it induces a gain of about 7 % on the fourth benchmark (Climate). This observation may be related to the nature of the texts, since the fourth benchmark is the only one that contains tweets. Indeed, tweets are very short text documents (less than 140 characters), while product reviews or debate reports can contain hundreds of words. Regarding the performance of each lexicon, we notice that it depends on the benchmark: no lexicon obtains the best results on all the benchmarks. FEEL obtains the best results on two benchmarks (online reviews and debate transcriptions), Polarimots obtains the best results on Videos Games and Diko on tweets. Globally, FEEL obtains very competitive results, being the best on two benchmarks and second on a third one (Climate); the difference between FEEL and the best configuration is always less than 1 %. Regarding the two FEEL variants derived from the re-annotation, we observe only a small change in the results compared to the original resource. This may be explained by the very high consistency between FEEL_WiCxt and FEEL, as presented in Table 6. On the other hand, the example sentence chosen for the annotation in context may be unrepresentative of the term's use across the whole benchmark.

4.4 Evaluation in an emotion classification task

Only the fourth benchmark provides emotion classes for its text documents (tweets). It uses an emotional typology divided into 18 classes, as presented in Fig. 3. As mentioned before, these emotional classes are very unbalanced: for example, only six tweets are associated with the emotion Boredom, while 2148 tweets are labeled with the emotion Valorization. Therefore, macro averaging is not suitable in this case, and we only consider label-frequency-based micro averaging. Regarding the lexicons, Polarimots is the only resource that does not consider emotions, so we perform our evaluations using the remaining lexicons. FEEL proposes six emotion classes, Affects has 45 emotions and Diko associates its terms with 1198 emotion expressions. We use the same baseline as in the polarity classification task (Bag_Of_Words) and evaluate the addition of features extracted from each emotion lexicon. These features represent the number of terms expressing each emotion: therefore, six features are added for FEEL, FEEL_WiCxt and FEEL_InCxt, 45 features for Affects and 1198 features for Diko. The feature selection step is applied after adding these features. Lemmatization and lowercasing are also performed when searching for the emotion terms inside the tweets. Table 15 presents the emotion classification results when considering the 18 original emotion classes.
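
A minimal sketch of these per-emotion count features is given below; the emotion lexicon is represented as a dictionary mapping terms to sets of emotions, and the names and toy example are ours, not the actual implementation.

```python
# One count feature per emotion class of the lexicon (6 for FEEL, 45 for
# Affects, 1198 for Diko). Illustrative sketch, not the authors' code.

FEEL_EMOTIONS = ["joy", "surprise", "anger", "fear", "sadness", "disgust"]

def emotion_features(doc_tokens, emotion_lexicon, emotions=FEEL_EMOTIONS):
    counts = dict.fromkeys(emotions, 0)
    for tok in doc_tokens:
        for emo in emotion_lexicon.get(tok, ()):
            if emo in counts:
                counts[emo] += 1
    return [counts[e] for e in emotions]

# Toy example with a one-entry lexicon:
print(emotion_features("je suis très heureux".split(), {"heureux": {"joy"}}))
# -> [1, 0, 0, 0, 0, 0]
```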

As shown in Table 15, all emotion lexicons significantly improve the classification results. The gain is between 5.7 and 12.9 % in micro precision, between 3.9 and 5.3 % in micro recall and between 5 and 7.1 % in micro F-measure. Diko obtains the highest micro recall but the lowest micro precision (due to its large number of entries). FEEL is ranked third but close to the best configuration for each evaluation metric. FEEL_WiCxt and FEEL_InCxt slightly improve the classification results. However, the emotional typology of the Climate corpus (18 classes) does not correspond to a well-known classification, so FEEL is being evaluated on classes that it does not consider. In order to estimate each lexicon's performance on the Ekman emotional classes, we perform the same experiments considering only the four Ekman emotions present in the Climate corpus. The division of the considered tweets between these emotions (surprise, anger, fear and sadness) is presented in Fig. 4. In addition to the bag of words configuration, we evaluate the addition of six features for FEEL, FEEL_WiCxt and FEEL_InCxt, 45 features for Affects and 1198 features for Diko.

Table 15 Emotion classification results when considering 18 emotional classes

Table 16 shows that FEEL obtained the best results, with a gain of 0.3 % in micro precision, 4.4 % in micro recall and 4.6 % in micro F1-measure compared to the bag of words configuration. FEEL_WiCxt and FEEL_InCxt come second with close precisions, recalls and F1-measures. Finally, Affects and Diko cause a decrease in the evaluation metrics, which suggests that these lexicons are not well adapted to the Ekman emotions. Since Affects and Diko propose finer emotional typologies, one might expect that this would not hinder classification performance with fewer emotional classes; even so, FEEL significantly outperforms these two lexicons on the available Ekman emotions (four out of six). Since Climate is the only available French benchmark for emotion classification, we could not test FEEL on the remaining Ekman emotions: joy and disgust.

Table 16 The emotion classification results when considering Ekman emotional classes

5 Conclusion

Due to its huge number of applications, sentiment analysis has received much attention in the last decade. Most studies have dealt with polarity detection in English texts. Although emotion detection has many applications (such as detecting angry customers and escalating them to a higher level of support), only a few studies have considered it, especially for French. In this work, we presented the elaboration and the evaluation of a new French sentiment lexicon. It considers both polarity and emotion following the Ekman emotional typology. It has been compiled by translating and expanding with synonyms the English lexicon NRC-EmoLex. A professional human translator supervised all the automatically obtained terms and enriched them with new manual translations. She validated more than 94 % of the entries that had been found by at least three online translators, and less than 18 % of those obtained from fewer than three translators. This result shows that online translators can be used to inexpensively compile such resources, given appropriate heuristics and thresholds. The final resource contains 14,127 French entries, of which around 85 % are single words and 15 % are compound terms. While the professional manual translations can be considered reliable, the associated sentiments and emotions may be subjective. Therefore, three new annotators re-evaluated the polarities and emotions associated with a subset of 120 terms; this step showed high consistency between the initial sentiments and the new ones. Then, we performed extensive evaluations on all the French benchmarks that we found in the literature for polarity and emotion classification, and compared our results with the existing French sentiment lexicons. In order to represent each lexicon, we used the number of terms expressing each sentiment as new features, but other configurations may be evaluated. The obtained results highlight that our new French Expanded Emotion Lexicon improves classification performance on various benchmarks dealing with very different topics. Indeed, FEEL obtained competitive results for polarity (being first on two benchmarks and always very close to the best configuration) and the best results for emotion (when considering the Ekman emotional typology). It can be noticed that the classification gain is greater for short text documents such as tweets. Finally, this work shows that automatic translation can be used to compile resources with different emotional typologies at low cost.

The first perspective of this work is to compile a benchmark of French text documents tagged with the six basic Ekman emotions. Similar benchmarks have been compiled for English (Strapparava and Mihalcea 2008) following the Ekman typology. Crowdsourcing tools can be used to obtain a large number of manual annotations. We can also query the Twitter API with the following hashtags: #joy, #surprise, #anger, #sadness, #fear and #disgust; indeed, Mohammad and Kiritchenko (2015) showed that this process leads to a good quality English benchmark. The second perspective focuses on the use of FEEL to build sentiment analysis systems. Using FEEL, we built a complete sentiment classification system that participated in the DEFT 2015 evaluation campaign. Among the 22 teams registered for the challenge, we were ranked first in subjectivity classification, third in polarity classification and fifth in emotion classification (when considering 18 classes). The proposed system is also based on SVM classifiers but with more elaborate features. A publicly available version of this system can be downloaded from GitHub. Furthermore, a sentiment classification platform is now under development: users will be able to use this system online or as an external API. Similar tools exist for English, such as Sentiment Treebank or Semantria. Finally, the proposed method can be used to inexpensively compile French lexicons for other applications. On the one hand, we want to detect agreement and disagreement in online forum discussions. The objective is to compute a user reputation value based on the replies addressed to that user (Abdaoui et al. 2015). Agreement and disagreement lexicons can be used to evaluate the trust or distrust expressed in the textual content of replies. We suggest using the proposed method to translate into French the English resources that have been compiled for agreement and disagreement (Wang and Cardie 2014). On the other hand, we are working on a project that aims to prevent suicide using social networks (Facebook, Twitter, forums, etc.). Cases of suicide have been reported in recent years where people had posted on social networks, expressing their thoughts or addressing messages to their families (Cherry et al. 2012). We believe that sentiment and emotion analysis can be adapted to detect dysphoric states. Specific lexicons for depression symptoms have been created for English (Karmen et al. 2015); similarly, automatic translation can be used to create depression symptom lexicons for French.