Keywords

1 Introduction

The messages posted in social networks provide a solid background about the ideas and opinions not only about the users of the social networks but also about the environment where they live. This information can be used and consumed by a wide range of institutions and organisms for strategic decision making.

Nowadays, Twitter is one of the most popular online social networking and micro-blogging services that enables its users to send and read text-based posts of up to 140 characters, known as tweets. Millions of users use Twitter to keep in touch with friends, meet new people and discuss about everything [1]. Companies are increasingly using Twitter to advertise and recommend products, brands, and services; to build and maintain reputations; to analyze users’ sentiment regarding their products or those of their competitors; to respond to customers’ complaints; and to improve decision making and business intelligence [2].

Several pieces of research have been conducted in recent years in order to automatically process the information on social networks [36]. An outstanding issue which provides research opportunities is trending topic detection. In the context of Twitter, trending topics represent the popular “topics of conversation”, among its users [7]. Monitoring and analyzing this rich and continuous flow of user-generated content can yield valuable information. However, most works about trend detection fail to take sentiment into consideration. Sentiment analysis gives an effective and efficient means to expose public opinion timely which gives vital information for decision making in various domains.

In this work, we propose an approach, known as SentiTrend, to trending topic detection on Twitter and its subsequent polarity detection. SentiTrend collects messages from Twitter and processes them in order to determine their trending topic based on the TF-IDF (Term Frequency–Inverse Document Frequency) model. Then, an estimated positive, negative or neutral sentiment value is assigned to each tweet related to the trending topic detected. The task of assigning a sentiment value to a tweet is done using a free software from Stanford University, known as Stanford Classifier. The Stanford Classifier is a Java-based implementation of a maximum entropy classifier, which takes data and applies probabilistic classification [8].

On the other hand, it is worth mentioning that studies exclusively deal with the English language, perhaps owing to the lack of resources in other languages. Considering that the Spanish language has a much more complex syntax than many other languages, and that it is the third most widely spoken language in the world, we firmly believe that the computerization of Internet domains in this language is of utmost importance. For this reason, this work is mainly motivated in the Spanish language.

This paper is structured as follows: Sect. 2 presents the state of the art on sentiment analysis and trend detection on social networks. Section 3 presents the architecture and functionality of our proposal. Section 4 shows a set of experiments carried out to validate the proposed approach concerning the effectiveness of our approach to trend detection and sentiment analysis. Finally, Sect. 5 describes our conclusions and future work.

2 Related Work

2.1 Sentiment Analysis

In recent years, several researchers have introduced methods for sentiment classification. Most of these efforts are based on two approaches: the semantic orientation approach and the machine learning approach. Both approaches have their advantages and drawbacks. The semantic orientation approach is based on opinion words, namely, words that are commonly used in expressing positive or negative sentiment. Opinion words are typically contained in a dictionary called opinion lexicon.

For example, Ghosh & Animesh [9] presented a rule-based method that can be used to identify the sentiment polarity of opinion sentences. They use SentiWordNet to calculate the overall sentiment score of each sentence. The results obtained in this work indicate that SentiWordNet could be used as an important resource for sentiment classification tasks. Peñalver-Martínez et al. [10], meanwhile, propose an innovative opinion mining methodology that takes advantage of new semantic Web-guided solutions to enhance the results obtained with traditional natural language processing techniques, sentiment analysis processes and Semantic Web technologies. Their proposal is specifically based on three different stages: (1) an ontology-based mechanism for feature identification, (2) a SentiWordNet-based technique to assign a polarity to each feature, and (3) a new approach for opinion mining based on vector analysis. Montejo-Ráez et al. [11] presented an unsupervised approach for polarity classification in Twitter. They integrated SentiWordNet to compute the final value of polarity. The synsets values are weighted with the PageRank scores obtained in the random walk process over WordNet.

However, tweets are not considered ‘‘normal’’ pieces of text, since the 140-character threshold imposes limitations in the length. A further peculiarity of the tweets is the extensive usage of jargon expressions, abbreviations, and emoticons. A disadvantage is the fact that jargon expressions are often domain dependent. These factors lead to a low recall when the lexicon-based method is applied on informal corpora of text, like posts from micro-blogs [3].

An alternative approach is the application of machine learning techniques. This approach is based on using a collection of data to train classifiers. The drawback of the machine learning-based methods is mainly focused on the manual labeling required over massive sets of tweets. However, several pieces of research showed that the machine learning approach outperforms the semantic orientation approach [12].

For example, Mohammad et al. [13] propose a basic automatic system to classify tweets and determine who is feeling certain emotion, and towards whom. They trained a Support Vector Machine (SVM) classifier for that. Sidorov et al. [14], meanwhile, examine how classifiers work while carrying out opinion mining of Spanish Twitter data. They explore how different settings (n-gram size, corpus size, the number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms and experiment with Naïve Bayes, Decision Tree, and Support Vector Machines. Some other works [15, 16] combine NLP (Natural Language Processing) and machine learning techniques in order to increase the effectiveness of their method. Salas-Zárate et al. [15] present a method that uses a hybrid feature extraction method based on POS (part-of-speech) pattern and dependency parsing. The features obtained are enriched semantically through common sense knowledge bases. Then, a feature selection method is applied to eliminate the noisy and irrelevant features. Finally, a set of classifiers is trained in order to classify unknown data. Habernal et al. [16] present in-depth research on supervised machine learning methods for sentiment analysis of Czech social media. They explore different pre-processing techniques and employ various features and classifiers. The authors also experiment with five different feature selection algorithms and investigate the influence of named entity recognition and preprocessing on sentiment classification performance.

Finally, some other more recent proposals are based on psycholinguistic tools for sentiment analysis such as LIWC [17, 18].

2.2 Trend Detection

There are several studies that have addressed trend detection on social networks. For example, [19] present a trend prediction method for news event or news topics on Twitter. Experimental results show that the method is simple and effective. The authors also propose and analyze several possible reasons for the trend rising and falling of news topics on Twitter. Kaleel & Abhari [6] propose a system for event detection and trending from tweet clusters which are discovered using LSH (Locality Sensitive Hashing) technique. Specifically, the key issues addressed by the authors are: (1) construction of a dictionary using incremental term TF–IDF in high dimensional data to create tweet feature vector, (2) leveraging LSH to find truly interesting events, (3) trending the behavior of events based on time, geo-locations and cluster size, and (4) speed-up the cluster discovery process while retaining the cluster quality. Ding et al. [20], meanwhile, focus on automated personalization of tweets for popular trending topics. The main objective is to classify the tweets information as “Like” or “Dislike” on a particular topic based on personal preferences. Martinez-Romo & Araujo [1] present a methodology based on two main aspects: the detection of spam tweets in isolation and without previous information of the user; and the application of a statistical analysis of language to detect spam in trending topics. Mathioudakis & Koudas [21] present TwitterMonitor, a system that performs trend detection over the Twitter stream. The system identifies emerging topics on Twitter in real time and provides meaningful analytics that synthesize an accurate description of each topic. Users interact with the system by ordering the identified trends using different criteria and submitting their own description for each trend.

We should state that our work differs from the existing works for two reasons: (1) Our approach is based on the Spanish language in comparison to most studies that exclusively deal with the English language, and (2) Our approach obtains an estimated positive, negative or neutral sentiment value for each tweet related to the trending topic detected.

3 SentiTrend Architecture

SentiTrend consists of a Back-End and a Front-End layer (see Fig. 1). On the one hand, The Back-End is divided into five main components: (1) Retrieve module, (2) Data extraction and verification module, (3) Pre-processing module, (4) Trend detection module, and (5) Sentiment analysis module. On the other hand, the Front-End consists of a web application developed in Java that allows users to view the trending topics detected, and the tweets corresponding to the selected trending topic including the percentage of positive, negative or neutral tweets.

Fig. 1.
figure 1

SentiTrend’s architecture.

3.1 Back-End Layer

As was previously mentioned, the Back-End layer is divided into five main components:

  1. 1.

    Retrieve module. This module is responsible for establishing, maintaining a connection with Twitter and retrieving tweets.

  2. 2.

    Data extraction and verification module. It avoids storing repetitive tweets and extracts valuable information from the tweets such as user, text, followers, etc.

  3. 3.

    Pre-processing module: This module carries out the data cleansing of each tweet by means of NLP techniques. In other words, it removes from the tweets information such as hyperlinks, emoticons, among others.

  4. 4.

    Trend detection module. It performs the trend detection based on a TF-IDF model.

  5. 5.

    Sentiment Analysis module. This module classifies tweets as positive, negative, or neutral.

A detailed description of the modules contained in the architecture shown above is provided in the following sections.

Retrieve module.

This module handles establishing and maintaining the connection with Twitter servers to retrieve tweets. We use Twitter4 J, a Java library that gives access to the Twitter API and assists in integrating the Twitter service into any Java application. In order to obtain useful results, we have established two search filters: (1) Track, and (2) Locations. The first filter consists of a comma-separated list of phrases which will be used to determine which tweets will be delivered on the stream. The second filter consists of a comma-separated list of longitude, latitude pairs specifying a set of bounding boxes to filter Tweets. Each bounding box should be specified as a pair of longitude and latitude pairs, with the southwest corner of the bounding box coming first. For example, to obtain the tweets from Spain, we need the following coordinates:

Data extraction and verification module.

In this module, information about each tweet is extracted. Also, a verification process is carried out. Next, a detailed description of the process performed is presented.

  1. 1.

    It obtains a tweet of the tail of tweets.

  2. 2.

    This module retrieves the tweet information namely, id, date, number of retweets, text, language, the user who wrote it, number of the user’s followers and hashtags and users that appear in the tweet.

  3. 3.

    It verifies if a tweet is original or a retweet

    1. (a)

      If a tweet is original, it verifies if it exists in the database with its id

      1. (i)

        If the tweet is not in the database, a sentiment classification is carried out (“sentiment analysis module”), and all information is stored in the database.

      2. (ii)

        Otherwise, data such as number of followers, number of retweets are updated in the database.

    2. (b)

      If a tweet is a retweet, the original tweet is obtained, as well as its id, date, number of retweets, text, language, the user who wrote it, the user’sfollowers, the user name, hashtags, and users named in the tweet.

      1. (i)

        If the original tweet is not in the database, a sentiment classification is carried out (“sentiment analysis module”), and all information is stored in the database.

      2. (ii)

        Otherwise, data such as number of followers and number of retweets are updated in the database.

Pre-processing module.

The pre-processing module carries out the data cleansing of each tweet by means of NLP techniques [22, 23].

The system carries out the following steps before extracting features from the text of the tweet.

  • Slang words translation: Tweets often contain slang words. Slang word translation means converting the slang words like lol, omg, among others, into their standard form.

  • Tokenization: The sentences are divided into words or tokens by removing white spaces and other symbols or special characters.

  • Case Normalization: The process is to turn the entire tweet into lowercase.

  • Stemming: It is the process of reducing all the remaining words to their respective stems. It is worth remarking that stemming finds the stem, and not the root of the words.

  • The removal of Stop Words: A stop word is defined as a word that contains no meaning or relevance in and of itself. All words that appeared as the most frequent in at least 80 % were classified as stop words. If a word was identified as a stop word, it was removed.

  • Identify presence of URL using a regular expression (“https?://\\S+\\s”) and remove all the URLs from the tweet.

  • Remove all the private usernames identified by @user and the symbol # of hashtags.

Trend detection module.

This module performs trend detection through two main phases. Firstly, a set of simple and composite features are extracted. This process is performed by using n-grams (like unigrams, bigrams and trigrams) [24]. For example, the features obtained from the sentence “big bang theory” are the following:

Secondly, in order to calculate the weight of the words, a TF.IDF model is used. TF-IDF is a statistical measure that is used to estimate the importance of a word in a document or in a collection of documents [6, 25]. Having said that, term frequency can be defined as:

$$ tf_{ij} = \frac{{n_{ij} }}{\text{N}} $$
(1)

where nij is the number of times word i occurs in document j and N is the total number of words in document j.

$$ N = \mathop \sum \limits_{k} n_{kj} $$
(2)

The second definition is often referred to as the normalized term frequency. Inverse document frequency is defined as

$$ idf_{i} = \log \left( {\frac{D}{{d_{i} }}} \right) $$
(3)

where di is the number of documents that contain word i and D is the total number of documents.

Therefore, the TF-IDF score for a word w in a document d is calculated by:

$$ tf - idf = tf_{ij} * idf_{i} $$
(4)

Sentiment analysis module.

The last module provides the negative, positive or neutral polarity of the tweets. Aiming to perform such a task, this module needed the previous development of a Machine Leaning-based module able to determine the polarity of a tweet, i.e. to determine if a tweet is positive, negative or neutral concerning a topic. The development of this module involved two main phases. Firstly, a corpus consisting of 1000 positive tweets, 1000 negative tweets, and 1000 neutral tweets was obtained. We used the Twitter API to collect the tweets. After downloading the tweets, each tweet was individually processed as described in a “pre-processing module” section. Also, we performed a manual review of the filtered tweets in order to make sure that the obtained tweets are relevant to our study. Finally, each tweet was classified by hand in order to ensure the quality of the corpus. This time-consuming task was performed in a period of 12 months by a group of five people with a great experience in the sentiment classification domain. We do not share this dataset the Twitter policy does not all us to share tweets contents.

Secondly, the corpus mentioned above was used to training a classifier, more specifically, we use the Stanford classifier, a Java-based implementation of a maximum entropy classifier, which takes data and applies probabilistic classification. We applied a Machine Learning (ML) approach as it has been applied in several works, achieving great results for sentiment classification. The Machine learning methods often rely on supervised classification approaches. This approach is based on using a collection of data to train the classifiers. Among the machine learning techniques commonly used in sentiment polarity classification we find Support Vector Machine (SVM) [26, 27], Naive Bayes (NB) [28, 29], and Maximum Entropy (MaxEnt) [30].

3.2 Front-End Web Application

SentiTrend provides a web interface where users can carry out the following tasks: (1) view the recent trends in real-time, (2) view tweets about a selected trend, as well as, the percentage and total of positive, negative, and neutral tweets.

Next, Fig. 2 shows the SentiTrend Web application.

Fig. 2.
figure 2

SentiTrend Web application.

4 Evaluation and Results

In order to evaluate the effectiveness of the system for sentiment classification and trend detection, we have used three evaluation measurements: precision, recall and F-measure. Recall (5) is the proportion of factual positive cases that were correctly predicted as such. On the other hand, precision (6) represents the proportion of predicted positive cases that are actually positive. Finally, F-measure (7) is the harmonic mean of precision and recall [31].

$$ Recall = \frac{TP}{TP + FN} $$
(5)

where TP is the number of true positives and FN is the number of false negatives.

$$ Precision = \frac{TP}{TP + FP} $$
(6)

where TP is the number of true positives and FP is the number of false positives.

$$ {\text{F}}1 = 2 *\frac{\text{Precision*Recall}}{{{\text{Precision}} + {\text{Recall}}}} $$
(7)

The experiments carried out in this study are described in detail below.

4.1 Trend Detection

In order to evaluate the effectiveness of the system for trend detection, several experiments were carried out. The experiments involve obtaining results for different time intervals with our system (SentiTrend), Twitter API, and Trends24, and then, carry out a comparison between the results obtained by the aforementioned tools. The experiments were evaluated using precision, recall, and F-measure metric. Aiming to calculate the corresponding scores, the following facts are considered.

  • True positives: the items that were identified as trending topic by SentiTrend, Twitter API and Trends24.

  • False positives: the items identified as trending topic by SentiTrend that were not identified as trending topic by Twitter API and Trends24.

  • False negatives: the items identified as trending topics by Twitter API and Trends24 that were not identified as trending topics by SentiTrend.

Table 1 shows precision (P), recall (R), and F-measure (F) results obtained by SentiTrend and Twitter API tools and SentiTrend and Trends24 tools for different time intervals.

Table 1. Trend detection results obtained by SentiTrend.

As can be seen in Table 1, SentiTrend obtains good results for trend detection with an average precision, recall, and F-measure of 80.0 % with regards to Twitter API, and 75.5 % with respect to Trends24. In fact, the best result (precision, recall, and F-measure of 90 %) was obtained by several tests for both SentiTrend-Twitter and SentiTrend-Trends24. Also, the results show that SentiTrend obtains more matches of trending topics with Twitter than with Trends24.

4.2 Sentiment Classification

The experiments of sentiment analysis were carried out on a set of tweets related to a trending topic. For this purpose, the trending topic with the highest score obtained by SentiTrend for each of the twenty case studies presented in the previous section (see Sect. 4.1) was selected. Then, 300 tweets related to the trending topic were collected. Each of them was classified as positive, negative, or neutral by both, an expert group on sentiment analysis and the SentiTrend system.

Finally, a comparison of the results obtained by the aforementioned methods was carried out through precision, recall, and F-measure metrics. The evaluation results are shown in Table 2.

Table 2. Sentiment classification results

As can be seen in Table 2, the system provides encouraging results for sentiment classification of tweets in the Spanish language, with average Precision, Recall and F-measure values of 81 %, 80.7 %, and 80.7 %, respectively.

4.3 Discussion

General results show that the system successfully performs trend detection in Twitter and polarity detection.

With respect to trend detection, experiments show that the method is effective. However, much remains to be done about this topic. For example, we believe that analyzing fake content on twitter would be an interesting factor.

Regarding sentiment analysis, the system provides encouraging results. As mentioned above, we have used the Stanford Classifier to perform MaxEnt (Maximum Entropy) classification. The results obtained for the MaxEnt are very good. These results can be justified by the analysis presented in [32], where the authors mention that MaxEnt has been successfully employed for natural language processing tasks since the main advantages of MaxEnt are its robustness and statistic efficiency. However, it would be interesting to carry out several experiments with other classifiers such as SVM, BayesNet by using some machine learning tools, such as Weka [33] and RapidMiner [34], aiming to compare the results provided by several algorithms.

5 Conclusions and Future Work

In this work, we have proposed SentiTrend, a system for trend detection and sentiment analysis. We have also presented the experiments whose objective was to evaluate the proposed approach concerning trend detection and sentiment analysis. Our proposal yielded encouraging results, with an average F-measure of 80.7 % for sentiment analysis, and an average F-measure of 80 % and 75.5 % for trend detection with regards to Twitter API and Trends24, respectively.

In spite of all the advantages and possibilities of the proposed approach, it has several limitations that could be improved in future work. First, our approach is only able to deal with tweets expressed in Spanish, which is a disadvantage owing to the vast amount of information available in other languages. We shall therefore attempt to apply this approach to the English language. Second, our approach is not able to detect irony, sarcasm, and satire. These aspects can play the role of a polarity reverse, with respect to the words used in the tweet. This is one of the most interesting aspects to check in social media for sentiment analysis. We plan to integrate a module to detect irony, sarcasm, and satire. Third, in order to train and validate the sentiment analysis method, we collected a corpus, which was manually labeled by an expert group. However, we plan to use a standard/benchmark corpus in order to evaluate the effectiveness of our method and compare our results with other proposed works. Finally, another disadvantage of our proposal is that it is not able to identify spam users as well as spam tweets. Trending topics are a very effective method for tricking users into visiting malicious or spam websites. Accordingly, the attackers collect information regarding the most popular trending topics and include them in tweets pointing to spam websites. Therefore, we are interested in carry out a study regarding spam propagation through Twitter such as that presented in [35].