1 Introduction

The Big Five personality model [4] comprises five fundamental categories of personality - Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to experience - which are further divided into dozens of more specific facets. For instance, the Neuroticism category includes facets representing Anxiety and Depression, among others. Big Five categories are strongly correlated with (and possibly defined by) language use and, as a result, the recognition of an individual’s personality traits from text is a well-established task in the Natural Language Processing (NLP) field [14].

Models for the recognition of personality traits from text are usually based on supervised machine learning methods that take as input a text corpus labelled with personality scores. These scores, in turn, are computed from a range of personality inventories (or questionnaires) such as the BFI-44 inventory [7]. The BFI-44 is a relatively short multiple-choice inventory comprising 44 short items such as ‘I see myself as someone who is depressed, blue’. Items are to be answered on a zero (disagree strongly) to five (agree strongly) scale.

Knowing the five fundamental categories of personality of an individual may be sufficient for a number of practical applications. For others, however, a more detailed assessment of personality facets may be called for. Assessing personality facets usually involves the use of a more extensive personality inventory, such as the 240-item NEO-PI-R [8]. From a computational perspective, however, large or complex inventories of this kind may be impractical, which may explain why studies on personality recognition from text [9, 11, 14, 17] are usually limited to the five main personality categories obtainable from short inventories such as the BFI-44.

Despite these difficulties, a compromise between convenience (as in the BFI-44) and expressiveness (as in NEO-PI-R) may still be possible. In particular, we notice that the work in [18] provided evidence that, although most facets cannot be explicitly captured by the BFI-44, a small subset of 10 facets (two from each of the main Big Five factors) is inferable from this short scale. Thus, it may be possible to obtain at least some of the facet labels available from NEO-PI-R at a much lower cost.

Based on these observations, the actual NLP question to be investigated in this paper is whether the 10 additional facets proposed in [18] may be automatically recognised from text labelled with BFI-44 information only. To this end, we developed a series of binary classifiers for Big Five facet recognition from a labelled corpus of Brazilian Facebook status updates, and we present reference results for further studies in this field. To the best of our knowledge, our work is the first attempt to learn personality facets in this way, and it is most likely the first of its kind to be devoted to the Brazilian Portuguese language.

2 Related Work

We are not aware of any large-scale work on Big Five facet recognition from text, but there is a wide range of studies focused on the more general task of recognising its main five personality categories. Given that the applicable methods are presumably similar, in what follows we briefly review a number of instances of the latter.

The work in [9] presents a comprehensive view of the personality recognition task from multiple computational perspectives (i.e., as classification, regression and ranking tasks), by comparing the use of written essays and speech corpora as input data, and by comparing the use of self-reported Big Five scores and those produced by specialists, among other issues. The study makes extensive use of psycholinguistic features provided by the LIWC [12] and MRC [3] databases, and results suggest that using ranking algorithms, speech as input data, and personality reports produced by specialists work best.

In contrast to the use of psycholinguistics-motivated features in [9] and others, the work in [11] makes use of n-gram models to classify extremes of personality using both Naive Bayes and SVM models. Evaluation based on a corpus of personal blogs achieves a maximum accuracy of 65%.

In the context of the PAN-CLEF shared task series [14], a number of supervised models of personality recognition have been developed based on Twitter data labelled with personality scores obtained from a 10-item Big Five inventory. These include the overall winner of the competition [1], which combines second-order attributes with an LSA text representation; the work in [5], which makes use of character and POS n-gram models; and the work in [19], which makes use of TF-IDF counts and stylistic features. For details, we refer to [14].

3 Personality Facet Recognition

The present study aims to compare a number of models of personality facet recognition from text. More specifically, we consider the set of 10 personality facets that, according to the method discussed in [18], may be inferred from the BFI-44 inventory [7]: Assertiveness and Activity facets (under the main Extraversion category), Altruism and Compliance (under Agreeableness), Order and Self-discipline (under Conscientiousness), Anxiety and Depression (under Neuroticism), and Aesthetics and Ideas (under Openness to experience).

The method proposed in [18] consists of a series of theoretically-motivated calculations (in addition to those already performed to obtain the basic Big Five personality scores) over the set of 44 responses provided by the BFI-44 inventory. Thus, provided that the full set of BFI-44 responses about an individual is known, computing these 10 additional facet scores is straightforward.

For instance, according to [18], the Activity facet of the Big Five Extraversion category is defined as the simple average of two of the BFI-44 scores from which the main Extraversion score is obtained in the first place. In the present work, these facet scores are therefore taken as given, and we do not discuss the underlying method to obtain them. For details, see [18].
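As a rough illustration of the kind of computation involved, the sketch below averages a subset of BFI-44 item responses to obtain a facet score. The `FACET_ITEMS` mapping, the item numbers, and the function name are hypothetical placeholders rather than the item assignments of [18], and the handling of reverse-keyed items is deliberately left out.

```python
from statistics import mean

# Hypothetical mapping from facet name to BFI-44 item numbers.
# The actual item assignments are defined in [18]; these are placeholders.
FACET_ITEMS = {
    "Activity": [11, 16],
}


def facet_score(responses, facet, facet_items=FACET_ITEMS):
    """Return the simple average of the item responses that define a facet.

    responses: dict mapping BFI-44 item number -> numeric answer.
    """
    items = facet_items[facet]
    return mean(responses[i] for i in items)


# Made-up answers for the two placeholder items.
answers = {11: 4, 16: 3}
activity = facet_score(answers, "Activity")
```

Any facet defined as a simple average could be added to the mapping in the same way; facets that need reverse-scored items would require an extra step not shown here.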

Following existing work on Big Five personality recognition for the English language and others [9, 11], personality facet recognition is presently regarded as a set of independent binary classification tasks. To this end, a document is to be labelled as a positive instance of a given facet if the corresponding author shows an above-average score for that facet when considering the entire set of authors in the domain. Since personality facets are, by definition, independent from each other [4], each document is to be assigned ten individual labels, one per facet, which are to be classified one at a time.
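The labelling scheme just described can be sketched as follows. This is a minimal illustration under our own assumptions: the function names and the toy scores are invented for the example, and "above average" is read as strictly greater than the mean over all authors.

```python
from statistics import mean


def binarise_facet(scores_by_author):
    """Map each author to 1 if their facet score is above the mean, else 0."""
    threshold = mean(list(scores_by_author.values()))
    return {a: int(s > threshold) for a, s in scores_by_author.items()}


def label_documents(doc_authors, facet_scores):
    """Give each document one binary label per facet, inherited from its author.

    doc_authors: list of author ids, one per document.
    facet_scores: dict facet -> {author id -> score}.
    """
    author_labels = {f: binarise_facet(s) for f, s in facet_scores.items()}
    return [{f: author_labels[f][a] for f in author_labels} for a in doc_authors]


# Toy data (made up): three authors, two facets, three documents.
facet_scores = {
    "Anxiety": {"u1": 2.0, "u2": 4.0, "u3": 3.0},
    "Order": {"u1": 5.0, "u2": 1.0, "u3": 3.0},
}
doc_labels = label_documents(["u1", "u2", "u2"], facet_scores)
```

Each facet then defines its own binary task, so a separate classifier can be trained on the same documents with a different label column.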

4 Experiment

4.1 Overview

We devised an experiment to compare three binary classifiers for personality facet recognition from text:

  • BoW: bag-of-words features from the 3000 most frequent words in the corpus

  • skip: average word vectors obtained from a skip-gram-1000 model

  • cbow: average word vectors obtained from a cbow-1000 model

The BoW model is built using Naive Bayes classification. Both the skip and cbow models are built using logistic regression and pre-trained word embeddings computed from a 150-million Brazilian Twitter corpus using word2vec [10] with window size = 5 and min_count = 10. In addition to these three classifiers, we also consider a simple Majority class baseline system for illustration purposes.
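The two kinds of document representation used by these classifiers can be sketched in plain Python as follows. This is only an illustration: the function names, the use of raw counts for the bag-of-words features, the zero vector for documents without known words, and the two-dimensional toy embeddings are all assumptions made for the example, and the classifiers themselves are not shown.

```python
from collections import Counter


def build_vocabulary(tokenised_docs, size=3000):
    """Return the `size` most frequent words in the corpus, in rank order."""
    counts = Counter(tok for doc in tokenised_docs for tok in doc)
    return [w for w, _ in counts.most_common(size)]


def bow_vector(tokens, vocabulary):
    """Count occurrences of each vocabulary word in one document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocabulary]


def average_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


# Toy corpus and toy embeddings (made up for illustration).
docs = [["bom", "dia", "bom"], ["dia", "feliz"]]
vocab = build_vocabulary(docs, size=2)
toy_emb = {"bom": [1.0, 0.0], "dia": [0.0, 1.0]}
x_bow = bow_vector(docs[0], vocab)
x_emb = average_embedding(docs[1], toy_emb, dim=2)
```

In the toy example the out-of-vocabulary word "feliz" is simply skipped when averaging, which is one common way of handling words missing from a pre-trained model.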

4.2 Data

We use the 2.2-million-word b5-post corpus of Brazilian Facebook status updates [13], comprising 194k status updates written by 1019 users, each of whom also provided a self-reported BFI-44 [7] inventory. The b5-post corpus has previously been taken as the input to a number of author profiling tasks [6], including personality recognition [17].

The text portion of the corpus was subjected to basic spell checking and term substitution (e.g., laugh expressions such as ‘haha’ were replaced by a common $LAUGH$ symbol, etc.). From the corpus inventories, 10 additional personality facets were inferred according to the method in [18]. This information constitutes the set of ten class labels for each document as discussed in the previous section.

4.3 Procedure

All models were built using 10-fold cross validation over the entire b5-post dataset. However, since we now intend to learn ten (facet) classes, and not only five (main categories), and since some facets may be considerably more sparse than others (e.g., the Depression facet of Neuroticism may be naturally less common than, say, Self-consciousness), data imbalance is a major concern in our work. As a means to alleviate this, we resort to SMOTE minority oversampling [2] with \(k=5\) neighbours.
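For readers unfamiliar with SMOTE, the core idea is to create synthetic minority-class instances by interpolating between a minority instance and one of its k nearest minority neighbours. The sketch below is a simplified, pure-Python rendition of that idea, not the implementation used in the experiment; the function name, the random seed, and the toy points are assumptions made for illustration.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by interpolating between a
    sample and one of its k nearest minority-class neighbours.

    minority: list of feature vectors (needs at least two samples).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for m in minority if m is not x]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=6, k=2)
```

When oversampling is combined with cross validation, a common precaution is to apply it to the training folds only, so that synthetic points do not leak into the test folds; the text above does not state which variant was used.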

5 Results

Table 1 shows reference results for the majority class baseline, and for the three models of interest. The first column represents mean F1 scores over the ten classification tasks, followed by the number of times (wins) in which each model was the overall winner, and the mean F1 measure for each individual class.

Table 1. 10-fold cross validation mean F1 scores.

Although all models present a considerable improvement over our admittedly simple baseline, the distinction among them is narrow, particularly between BoW and skip. A slight advantage of the cbow model over the others is, however, noticeable in the number of classes (wins) for which cbow was the overall winner (7 out of 10 classification tasks).
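To make the two summary figures concrete, the sketch below computes an F1 score for a binary task and counts, for each model, the number of facets on which it obtains the highest F1. It assumes F1 is measured on the positive class, which is not specified above; the function names and the toy numbers are invented for illustration and are not results from the experiment.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 of the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def count_wins(f1_by_model):
    """Count the facets on which each model has the highest F1.

    f1_by_model: dict model -> {facet -> F1}; ties go to the first model.
    """
    facets = list(next(iter(f1_by_model.values())))
    wins = {m: 0 for m in f1_by_model}
    for facet in facets:
        best = max(f1_by_model, key=lambda m: f1_by_model[m][facet])
        wins[best] += 1
    return wins


# Made-up scores for two models and two facets (not actual results).
toy_f1 = {
    "BoW": {"Anxiety": 0.6, "Order": 0.5},
    "cbow": {"Anxiety": 0.7, "Order": 0.4},
}
toy_wins = count_wins(toy_f1)
example_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```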

As is usually the case in personality classification, some personality traits tend to be more evident from text than others. In the present setting, we notice that Compliance and Depression recognition were the most challenging tasks. However, it remains unclear whether these facets are less explicit in language use in general, or simply less explicit in our Facebook domain.

Finally, we notice that the present results are generally similar to those observed in Big Five personality classification in English [9] and other languages, and also along the lines of previous studies on the recognition of the main Big Five categories from the b5-post corpus [15, 16].

6 Final Remarks

This paper presented a number of models of Big Five facet recognition from a Brazilian Portuguese Facebook corpus and corresponding BFI-44 information. Our study suggests that, not unlike the basic Big Five categories, the ten facets proposed in [18] may be recognised from text with reasonable accuracy when compared to a simple baseline system. In other words, our experiments suggest that we may in principle develop supervised models of personality recognition at a level of abstraction more specific than that obtainable from existing work, and without resorting to larger or more complex inventories to provide the required text labels.

The current work provides only initial reference results for further studies in this field, and a number of possible improvements are left as future work. In particular, we envisage the use of larger word embedding models and alternative learning architectures for this task, and further evaluation work by directly comparing our results against text labelled with actual facet information.