
1 Introduction

The drastic shift from read-only to read-write access to the Web has led people to interact with each other through social media platforms such as wikis, blogs, online forums, and communities. As a result, user-generated content on social media platforms is increasing tremendously. In particular, Web-based data in the form of opinionated text and reviews of products and services has become one of the largest contributors to social big data [1].

Analyzing the sentiments of people from such opinionated data helps both end users and businesses in decision-making, for example, when purchasing products, launching new products, or assessing a company's reputation among customers. Sentiment analysis, also termed opinion mining, is the automated process of extracting the polarity of opinionated text. Alongside polarity, the subject and the opinion holder can also be identified using sentiment analysis. Sentiment analysis has been one of the most active research areas in natural language processing since 2000 and continues to be a highly sought-after research domain. It is forecast that the NLP market will reach $22.3 billion by 2025 [2].

Because of the proliferation of diverse opinion sites, it is difficult to find and monitor all the sites, collect the information pertaining to a given domain, and perform sentiment analysis. Moreover, it is difficult for human analysts to segregate the opinionated data from long blogs and forums and summarize the opinions. This gives rise to the need for automated sentiment analysis systems.

Numerous techniques based on supervised and unsupervised learning have been put forth to date to perform sentiment analysis. In supervised learning, early literature focused on applying machine learning techniques such as naïve Bayes, support vector machines, and feature learning algorithms [3]. Unsupervised methods include the use of sentiment lexicons, grammatical analysis, etc.

Deep learning has emerged as a powerful technique for solving a multitude of problems in the domains of computer vision [4,5,6,7,8], topic modeling [9,10,11], natural language processing [12,13,14], speech recognition [15], social media analytics [16,17,18], etc. Inspired by this success, deep learning-based sentiment analysis has gained great popularity over the last five years. This book chapter sheds light upon the progress made in deep learning-based sentiment analysis by giving an overview of deep learning-based sentiment analysis models. Figure 1 gives a glimpse of the main topics covered to demystify the application of deep learning for sentiment analysis.

Fig. 1 Demystified overview of application of deep learning for sentiment analysis

2 Taxonomy of Sentiment Analysis

Figure 2 shows the taxonomy of the traits to be considered while designing the sentiment analysis models.

Fig. 2 Taxonomy of the traits for sentiment analysis models

2.1 Sentiment Analysis, Polarity, and Output

Sentiment analysis is an automated process that predicts the polarity of opinionated text in terms of positive, negative, and neutral [19]. Fine-grained sentiment analysis involves the following categories: very positive, positive, neutral, negative, and very negative. These categories can be mapped to a rating score; for example, “very positive” can be mapped to 5 stars, whereas “very negative” maps to 1 star. For multiple documents, the individual polarities obtained for each document can be mapped to ratings and then aggregated to give an overall score.

2.2 Levels of Sentiment Analysis

Sentiment analysis is performed at various levels of granularities such as document, sentence, and aspect-based. These levels have been discussed in this sub-section.

Document level

This level determines the sentiment of a complete paragraph or document. The sentiment analysis model assumes that the document contains opinionated text about a single entity, so this level does not support documents comparing multiple entities. The problem of determining whether the document has positive or negative polarity is portrayed as a binary classification problem. It can also be handled as a regression problem, for instance, assigning a rating score in the range of 1–5 stars for movie reviews; the same task can alternatively be modeled as a five-class classification problem.

Sentence level

This level of sentiment classification aims to determine the sentiment of a single sentence. Subjectivity classification and polarity classification can be used for inferring the sentiment of a sentence. Subjectivity classification focuses on finding whether a sentence is subjective or objective, whereas polarity classification determines whether a given subjective sentence is positive or negative. Existing deep learning techniques focus on predicting the polarity of a sentence as positive, negative, or neutral. As sentences are shorter than documents, semantic and syntactic features obtained via POS taggers, parse trees, and lexicons can be used for sentence-level sentiment classification. Similar to the document-level assumption, sentence-level sentiment classification assumes that each sentence contains sentiment about a single entity.

Aspect-based sentiment analysis (ABSA)

In this level, the sentiments users express toward aspects (features) of entities (objects) such as movies and restaurants are extracted. It aims to find aspect and polarity pairs from a given text and assumes that a single entity is present per document. As mentioned in [20], aspect-level sentiment analysis can be divided into four tasks: aspect term extraction, aspect term polarity, aspect category detection, and aspect category polarity. Aspect term extraction involves identifying the aspect terms from a set of sentences with pre-defined entities (e.g., laptops) and returning the list of distinct aspect terms. The second sub-task, aspect term polarity, focuses on determining the polarity of the aspect terms detected in the first sub-task. Aspect category detection identifies the aspect categories in each sentence based on a pre-defined set of aspect categories (e.g., general, price). The fourth sub-task, aspect category polarity, focuses on determining the polarity of each aspect category from a given set of sentences. Table 1 gives an example and output of each sub-task in ABSA.

Table 1 Phase-wise examples in ABSA and output labels

Targeted ABSA is an extension of aspect-based sentiment analysis. ABSA assumes the occurrence of a single entity per document, whereas targeted ABSA assumes a single sentiment toward each aspect of one or more entities. Targeted ABSA extracts the target entities, their different aspects, and the corresponding sentiments. For example, consider “The ambience is good in Viceroy but the service is bad, on the other hand, the staff in Novotel is very prompt and the food is tasty as usual.” This instance talks about aspects of two different hotels. Targeted ABSA recognizes “Viceroy” and “Novotel” as two target entities and outputs the labels {Viceroy, ambience, positive}, {Viceroy, service, negative}, {Novotel, service, positive}, {Novotel, food, positive}.

2.3 Domain Applicability, Training, and Testing Strategy

Domain applicability states whether the sentiment analysis model performs in-domain or cross-domain sentiment analysis. For in-domain sentiment analysis, training and testing are done on the same target domain, i.e., a domain-specific training and testing strategy is applied. Sometimes, the target domain on which sentiment analysis is to be performed has little or no labeled data associated with sentiment classes, and it is therefore difficult to train a model on such data. In this case, a domain adaptation [21] (transfer learning) technique is applied for cross-domain sentiment analysis, in which a model is trained on a domain with labeled data and tested on a target domain with little or no labeled data.

2.4 Language Support

Sentiment analysis models can be categorized into monolingual, multi-lingual, and cross-lingual models based on the languages they support. Cross-lingual sentiment analysis models are trained on a resource-rich language and then tested on a resource-poor language.

2.5 Evaluation Measures

Evaluation metrics commonly used for sentiment analysis are accuracy, F1 score, average recall (AvgRec), macro-average F1 score, ranking loss, macro-averaged mean absolute error, least absolute error (LAE), mean squared error (MSE), Pearson correlation coefficient, Kullback–Leibler divergence (KLD), and area under the ROC curve (AUC). These metrics are discussed in Sect. 5.

3 Text Representation for Sentiment Analysis

Figure 3 depicts various traits to be considered when representing text for sentiment analysis using deep learning. Each trait is discussed in the subsequent sub-sections.

Fig. 3 Traits to be considered to represent the text for sentiment analysis using deep learning

3.1 Embedded Vectors

Most machine learning algorithms, which map input to output via function approximation, require a numerical representation of the input data. Embedding methods (also called vectorization or encoding) convert input data (i.e., words, sentences, paragraphs, documents, dates, emojis, graphs, etc.) into real-valued vectors that capture hidden semantic relations within the input data. Embedding models are one of the successful applications of unsupervised learning and have been popularly used in deep learning-based NLP tasks. Bengio et al. [22] introduced the concept of word embeddings. Some noteworthy models which can be used for representing input text are discussed below.

Collobert and Weston (C&W) model

The C&W model proposed in [23] is a multi-layered neural network architecture, trained on a large dataset, whose representations carry syntactic and semantic meaning. The model is designed to be agnostic to task-specific feature engineering and therefore serves as a useful word representation model for a wide variety of NLP tasks.

Word2vec

The vectors used for representing words are neural word embeddings. Word2vec [24] is used to obtain distributed representations of words, i.e., word embeddings. Word2vec trains each word against the other words that neighbor it in the input corpus. This training can be done using either of two models: the continuous bag-of-words (CBOW) model or the skip-gram model. The CBOW model predicts a target word from its surrounding context, whereas the skip-gram model predicts the words in the surrounding context given the central word.
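To make the CBOW/skip-gram distinction concrete, the following minimal sketch trains both variants with the gensim library (version 4 or later); the library choice, toy corpus, and hyperparameters are illustrative assumptions of this example and not part of any approach surveyed here.

```python
# Minimal sketch: training CBOW and skip-gram word2vec models with gensim.
# The toy corpus and hyperparameters are illustrative only.
from gensim.models import Word2Vec

corpus = [
    ["the", "movie", "was", "surprisingly", "good"],
    ["the", "plot", "was", "boring", "and", "predictable"],
]

# sg=0 -> CBOW (predict the target word from its context),
# sg=1 -> skip-gram (predict context words from the target word).
cbow = Word2Vec(corpus, vector_size=50, window=2, sg=0, min_count=1, epochs=50)
skipgram = Word2Vec(corpus, vector_size=50, window=2, sg=1, min_count=1, epochs=50)

print(cbow.wv["good"].shape)                     # (50,) dense embedding
print(skipgram.wv.most_similar("good", topn=3))  # nearest neighbors in the toy space
```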

fastText

Facebook’s AI research laboratory came up with the fastText library [25], which efficiently learns word representations. By making use of character-level information, fastText can also provide representations for rare words.

Global Vectors for Word Representation (GloVe)

The GloVe model [26] produces vector representations for words in an unsupervised manner. It uses both global matrix factorization and a local context window to obtain the representation of a word.

Embeddings from Language Models (ELMo)

Traditional word embedding models like word2vec and GloVe cannot handle the contextual meaning of words and therefore provide the same vector representation for a word with different meanings. For instance, the meaning of the word “stick” differs in the following examples.

Sentence 1: This stick is made up of wooden material

Sentence 2: Let’s stick to one goal at a time

The ELMo model [27] cleverly handles the multiple meanings of words, as in the sentences above, by representing the embedded vector as a function of the entire sentence containing the word. ELMo representations model the syntactic and semantic characteristics of a word and handle words with multiple meanings based on context (polysemy modeling). Word vectors obtained from the ELMo model are learned functions of the hidden states of a bi-directional language model. As ELMo vectors are character-based, the model can represent out-of-vocabulary words unseen in the training phase by making use of morphological clues.

Sentiment-Specific Word Embeddings (SSWE)

Tang et al. [28] proposed the SSWE model, which incorporates sentiment knowledge into the continuous representation of words. For this, three neural network-based models have been designed, viz. SSWEh, SSWEr, and SSWEu. SSWEh is trained with the strict constraint of predicting the sentiment distributions [1, 0] and [0, 1] for positive and negative n-grams, respectively. In SSWEr, the strict softmax constraint is removed. Both SSWEh and SSWEr prohibit the generation of corrupted n-grams. Being unified, SSWEu captures both the sentiment of sentences and the syntactic contexts of words.

Graphs from LOw-level unit Modeling (GLoMo)

The Graphs from low-level unit modeling (GLoMo) framework is based on unsupervised latent graph learning [29]. It is a transfer learning framework developed to improve the performance of tasks like sentiment analysis, natural language inference, question answering, and even image classification.

Universal Language Model Fine-tuning (ULMFiT)

ULMFiT [30] is a transfer learning model which can be used for any natural language processing task, and its pre-trained models can be leveraged for sentiment analysis. In this approach, a language model is pre-trained on a general domain and then fine-tuned on the target domain. Its working is invariant to document size, number, and label type, and it is therefore claimed to be universal. It follows a single architecture and training procedure for carrying out diverse tasks and does not need domain-specific documents and labels.

OpenAI Transformer

The OpenAI Transformer [31] first trains a transformer model on a large corpus in an unsupervised manner, using language modeling as the training signal. After this, fine-tuning the model on a small supervised dataset enables it to solve a specific task.

Bi-directional Encoder Representations from Transformers (BERT)

BERT [32] pre-trains deep bi-directional representations from unlabeled data by jointly conditioning on both left and right context in all layers. Due to this, it can be fine-tuned to solve many NLP tasks by just adding one output layer to the pre-trained model.
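As an illustration of this fine-tuning recipe, the following sketch adds a classification head to a pre-trained BERT model using the Hugging Face transformers library (an assumption of this example, not a tool prescribed by [32]); the model name, toy data, and hyperparameters are illustrative.

```python
# Minimal sketch: fine-tuning pre-trained BERT for binary sentiment classification
# by adding a single classification head (Hugging Face transformers; illustrative).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)   # adds one output layer on top of BERT

texts = ["The food was great!", "Terrible service, never again."]
labels = torch.tensor([1, 0])            # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
```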

3.2 Strategy of Initializing the Embedded Vectors

Table 2 gives details of pre-trained models which can be leveraged for sentiment analysis. Word embeddings can be initialized by setting the vector representations to random values (random initialization). Another way is to initialize the model with pre-trained word embeddings and then fine-tune them during training.

Table 2 Pre-trained word embedding models and corpora

Pre-trained models based on various corpora such as Wikipedia (C&W), Google News (Google), Twitter with emoticons (SSWE), the Amazon corpus (Amazon), and Wikipedia and Twitter (GloVe) have been developed. Applying word2vec to a specific corpus yields customized embeddings [37, 38]. As mentioned in [33], random initialization may result in getting stuck in local minima with stochastic gradient descent (SGD), and if the pre-trained embeddings are not fine-tuned, the automatic feature learning capacity of deep neural networks cannot be leveraged. Therefore, using pre-trained embeddings as an initializer and then fine-tuning them helps to make the model effective [39].
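The following PyTorch sketch contrasts the two initialization strategies; the pre-trained matrix is a random stand-in for real GloVe or word2vec vectors, and all sizes are illustrative.

```python
# Sketch of the two embedding initialization strategies described above (illustrative).
import torch
import torch.nn as nn

vocab_size, dim = 10_000, 300

# (a) Random initialization: vectors are learned from scratch during training.
random_emb = nn.Embedding(vocab_size, dim)

# (b) Pre-trained initialization: load vectors (e.g., GloVe) and fine-tune them.
pretrained = torch.randn(vocab_size, dim)   # stand-in for real pre-trained vectors
finetuned_emb = nn.Embedding.from_pretrained(pretrained, freeze=False)

# freeze=True would keep the pre-trained vectors fixed, which, as noted above,
# prevents the network from adapting the representations to the sentiment task.
```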

3.3 Enhancing the Embedded Vectors

For enhancing the effectiveness of the embedded vector, additional features (from a word, sentence, or document) can be extracted and appended to a pre-trained embedded vector. For example, a word vector can be appended with the sentiment, parts-of-speech (POS) tag, word subjectivity, total count of syllables, number of characters with or without punctuation, etc.
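A simple, illustrative sketch of such feature concatenation (with made-up feature values) is shown below.

```python
# Sketch: enhancing a pre-trained word vector by appending hand-crafted features
# (sentiment score, a one-hot POS tag, syllable count); all values are illustrative.
import numpy as np

word_vector = np.random.rand(300)          # stand-in for a pre-trained embedding
sentiment_score = np.array([0.8])          # e.g., prior polarity from a lexicon
pos_tag_one_hot = np.array([0, 1, 0, 0])   # e.g., [noun, adjective, verb, other]
syllable_count = np.array([2])

enhanced_vector = np.concatenate(
    [word_vector, sentiment_score, pos_tag_one_hot, syllable_count])
print(enhanced_vector.shape)               # (306,)
```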

Words which are out-of-vocabulary (OOV) for the embedding model lack a vector representation. For such OOV words, a vector representation is obtained by approximation based on the OOV word's context. The following are some solutions for handling OOV words. (1) Given a sentence and the corresponding OOV word, language modeling performs sequencing of the words in the sentence and then predicts the meaning of the word by comparing it with similar sentences. (2) Another solution is to use character- or n-gram-level embeddings obtained from fastText. (3) Embeddings can be trained from scratch on the text; however, this suffers from overfitting and cannot handle sentences with complex structure. Tang et al. [40] handled the problem of OOV words for the domain of users and products by averaging the representations of the available data related to users and products. Creating a domain-specific word embedding model also helps to improve performance [28, 41, 42].
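As an illustration of solution (2), the following sketch uses gensim's FastText implementation to compose a vector for an unseen word from its character n-grams; the corpus and settings are illustrative assumptions of this example.

```python
# Sketch of handling an out-of-vocabulary word with subword (character n-gram)
# embeddings via gensim's FastText implementation; corpus is illustrative.
from gensim.models import FastText

corpus = [
    ["battery", "life", "is", "excellent"],
    ["screen", "quality", "is", "poor"],
]
model = FastText(corpus, vector_size=50, window=2, min_count=1, min_n=3, max_n=5)

# "batteries" never occurs in the corpus, but its character n-grams overlap with
# "battery", so FastText can still compose an approximate vector for it.
print("batteries" in model.wv.key_to_index)   # False (out of vocabulary)
print(model.wv["batteries"].shape)            # (50,) vector built from n-grams
```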

3.4 Approximation Methods

Reducing the computational complexity of the final softmax layer is one of the crucial challenges to be handled while designing a better word embedding model. Therefore, approximation algorithms based on sampling and softmax-based approaches have been devised by the research community. These approaches are discussed in the following sub-sections.

3.5 Sampling-Based Approaches

Sampling-based approaches approximate the normalization term present in the denominator of the softmax with a computationally inexpensive loss function. Sampling-based methods are useful only for training; during testing, the full softmax needs to be computed to obtain a normalized probability.

  • Importance sampling: Traditional importance sampling is based on Monte-Carlo sampling. It approximates a target distribution via unigram distribution.

  • Adaptive importance sampling: Approximation using importance sampling works better for large samples [43]. Bengio and Senécal proposed an adaptive importance sampling method [44] which works on an n-gram distribution.

  • Target sampling: Jean et al.'s [45] approximation training algorithm is based on biased importance sampling, namely target sampling, which allows training a neural machine translation model with a much larger target vocabulary. Once the model is trained, they limit the target words being sampled by forming a subset of the vocabulary, obtained by partitioning the vocabulary and selecting pre-defined sample words in each partition.

  • Noise contrastive estimation (NCE): NCE [46] is more stable than importance sampling, which carries the risk of the proposal distribution diverging from the target distribution. Unlike importance sampling, NCE does not estimate the probability of a word directly; instead, it uses an auxiliary loss that maximizes the probability of correct words via optimization.

  • Negative sampling: It minimizes the negative log-likelihood of words in the training set using a logistic loss function and focuses on learning word representations of high quality (a minimal sketch of this objective follows the list below).
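The sketch below illustrates the negative-sampling objective in PyTorch; the embeddings are random stand-ins, and the snippet is not tied to any particular implementation.

```python
# Minimal sketch of the negative-sampling objective (PyTorch, illustrative):
# maximize the score of an observed (center, context) pair while pushing down
# the scores of k randomly sampled "noise" words.
import torch
import torch.nn.functional as F

dim, k = 100, 5
center = torch.randn(dim)        # embedding of the center word
true_context = torch.randn(dim)  # embedding of an observed context word
noise = torch.randn(k, dim)      # embeddings of k sampled noise words

pos_term = F.logsigmoid(true_context @ center)     # log sigma(u_o . v_c)
neg_term = F.logsigmoid(-(noise @ center)).sum()   # sum_k log sigma(-u_k . v_c)
loss = -(pos_term + neg_term)    # minimized during training
```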

3.6 Softmax-Based Approaches

  • Hierarchical softmax (H-Softmax): Approximation based on hierarchical softmax [47] replaces the softmax layer with a hierarchical tree in which the leaves correspond to words. The hierarchical structure decomposes the probability calculation into a sequence of smaller decisions, alleviating the need to compute the expensive normalization over all words. Therefore, it achieves a speed-up for word prediction tasks.

  • Differentiated softmax: Differentiated softmax (D-softmax) [48] is a variant of the traditional softmax layer. It is based on the idea that the number of parameters a word requires differs and should vary with the word's frequency of occurrence. Due to this principle, D-softmax works faster during testing. However, the assignment of a smaller number of parameters to rarely occurring words does not help the model handle rare words efficiently (a related sketch using PyTorch's adaptive softmax follows this list).

  • CNN-softmax: Kim et al.'s [49] work focuses on modifying the traditional softmax layer using a character-level convolutional neural network (CNN), which is used for producing the input word embeddings. Jozefowicz et al. [50] designed a softmax loss based on a character-level CNN, named CNN-softmax. However, character-based models cannot handle the same word with different meanings, because a continuous space representation is used for the characters and the model is prone to learning a smooth mapping from characters to word embeddings. Therefore, a correction factor, learned per word, can be introduced.
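As a practical illustration of frequency-aware softmax approximation, PyTorch provides an adaptive softmax layer (in the spirit of differentiated softmax, though based on Grave et al.'s adaptive softmax rather than [48] itself). The sketch below shows it as a drop-in replacement for a full softmax output layer; sizes and cutoffs are illustrative.

```python
# Sketch: replacing a full softmax output layer with PyTorch's adaptive softmax,
# which assigns smaller projections to less frequent words (the vocabulary is
# assumed to be indexed by descending frequency). Illustrative only.
import torch
import torch.nn as nn

vocab_size, hidden = 50_000, 512
adaptive = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[2_000, 10_000],   # head = 2k most frequent words, two tail clusters
)

hidden_states = torch.randn(32, hidden)         # batch of hidden vectors
targets = torch.randint(0, vocab_size, (32,))   # gold next-word indices
output = adaptive(hidden_states, targets)
print(output.loss)                              # negative log-likelihood
```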

4 Deep Learning Approaches for Sentiment Analysis

In this section, highly significant deep learning approaches for sentiment analysis at document, sentence, and aspect-level have been discussed. Table 3 compares these approaches based on text representation, neural network model, dataset, and crux of each approach.

Table 3 Comparative study of deep learning-based sentiment analysis approaches

Document-level sentiment analysis approaches

Zhai and Zhang [34] proposed a semi-supervised denoising autoencoder model for document-level sentiment analysis. It considers sentiment information during the learning phase to obtain good document vector representations. It learns a task-oriented data representation by using the Bregman divergence as the autoencoder loss and deriving a discriminative loss function from the class labels.

Zhou et al. [52] proposed bilingual sentiment embeddings for cross-lingual sentiment classification. In this approach, a denoising autoencoder is used to learn bilingual embeddings in an unsupervised way. Then, via supervised learning, sentiment information from the sentiment labels of documents is incorporated into the bilingual embeddings to obtain bilingual sentiment word embeddings.

For learning the document representation, Tang et al. [51] utilized the relationships between sentences. They first used a CNN or long short-term memory (LSTM) network for sentence representation learning and then applied a gated recurrent unit (GRU) to adaptively encode the semantics of sentences and their relations into a document representation for sentiment analysis.

For overcoming the shortcomings of the bag-of-words model, Le and Mikolov proposed an unsupervised algorithm, namely the paragraph vector [54], which learns fixed-length representations from variable-sized text such as sentences, paragraphs, and documents. It learns representations by predicting the surrounding words based on contextual information from the text. After learning the vector representation, a logistic classifier is trained to predict the sentiments. During testing, the network for vector representation is frozen and the representation for the test data (sentence, paragraph, or document) is learnt using gradient descent. The learnt vector representation is then fed to logistic regression for predicting the sentiment.

Tang et al. [40] proposed a supervised learning framework which incorporates user- and product-level information in a neural network model to perform document-level sentiment classification. Incorporating user-level and product-level information facilitates capturing the individual preferences of users and the overall qualities of products, respectively, to provide a better representation of the text.

Like [51], Chen et al. [52] incorporated user- and product-level information in a hierarchical LSTM model via word and sentence-level attention mechanism. Based on the principle of compositionality [80], they modeled document semantics in a hierarchical manner at word, sentence, and document level. They used word-level user-product attention to get sentence representation and sentence-level user-product attention to get document representation.

Dou [53] also proposed a user-product deep memory network (UPDMN) for capturing user and product information. Initially, a document is represented using an LSTM, and then a deep memory network having computational layers with a content-based attention mechanism is applied for predicting the review rating.

For handling semantic knowledge in long text, Xu et al. [76] put forth a cached LSTM model. The cache mechanism divides the memory into different groups with varying forgetting patterns, enabling emotional information to be captured both locally and globally for improved sentiment classification. Compared to standard LSTM, this model converges faster.

The hierarchical attention network based on a GRU sequence encoder proposed in [55] applies an attention mechanism at the word and sentence level for document-level sentiment classification. It incrementally constructs a document vector by aggregating significant words into sentence vectors and, in turn, aggregating significant sentence vectors into document vectors. Song et al. [56] proposed a hierarchical iterative attention model using bi-directional LSTM which captures the interaction between documents and aspects at the word and sentence level to learn the document representation in an aspect-specific fashion; this model performs multi-aspect sentiment classification.

Zhou et al. [57] proposed to use bi-directional LSTM with a sentence-level attention mechanism for cross-lingual sentiment classification. Initially, a machine translation tool translates the training data into the target language. They used bi-directional LSTMs for modeling the document representation in the source and target languages. To remove the noise introduced by machine translation, a hierarchical attention mechanism is introduced which is trained jointly with the LSTM network. Li et al. [58] addressed the issue of selecting pivots for cross-domain sentiment analysis in a transfer learning setting. They used an adversarial memory network and jointly trained two networks for sentiment and domain classification.

Huang et al. [59] proposed two variants of representations to be used with LSTM for document-level sentiment classification. In the first variant, the document is represented by capturing the semantics of sentences from sentence vectors. In the second variant, the document is represented using sorted sentence vectors; for this sorted representation, the dataset is pre-processed to remove irrelevant sentences which do not carry sentiment information.
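Several of the document-level models above (e.g., [55, 56, 57]) build sentence or document vectors with an attention layer over encoder hidden states. The following is a minimal, illustrative PyTorch sketch of such a word-level attention layer; it is not the implementation of any specific cited paper, and all shapes and parameters are assumptions.

```python
# Minimal sketch of word-level attention over the hidden states of an encoder
# (e.g., a bi-directional GRU/LSTM), as used by hierarchical attention models.
import torch
import torch.nn as nn

class WordAttention(nn.Module):
    def __init__(self, hidden_dim: int, attn_dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, attn_dim)
        self.context = nn.Linear(attn_dim, 1, bias=False)  # word context vector

    def forward(self, hidden_states):            # (batch, seq_len, hidden_dim)
        u = torch.tanh(self.proj(hidden_states))
        scores = self.context(u).squeeze(-1)     # (batch, seq_len)
        alpha = torch.softmax(scores, dim=-1)    # attention weights
        return (alpha.unsqueeze(-1) * hidden_states).sum(dim=1)  # sentence vector

sentence_vec = WordAttention(hidden_dim=256)(torch.randn(8, 40, 256))
print(sentence_vec.shape)                        # torch.Size([8, 256])
```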

Sentence-level sentiment analysis approaches

Socher et al. [60] first put forth a recursive autoencoder network working in a semi-supervised manner for sentiment classification at the sentence level. This approach retrieves vector representations with reduced dimensions for multi-word phrases. As this method is based on a single-vector space model, it cannot capture the compositional meaning of long phrases.

Socher et al. [61] put forth a recursive matrix-vector model which additionally associates a matrix representation with each word in a tree structure. This approach addresses the problem of capturing the compositional meaning of long sentences of arbitrary syntax and length by representing each word and phrase with both a vector and a matrix: the word vector captures the inherent meaning, while the matrix captures how the word changes the meaning of neighboring words. An external parser is used for building the tree structure.

To perform supervised training and evaluate sentiment compositional models, Socher et al. [81] developed the Stanford Sentiment Treebank dataset [82]. They proposed a recursive neural tensor network based on tensor-oriented compositional features for efficiently capturing the interactions among the words in a sentence. The model was tested on a movie reviews dataset with five sentiment classes ranging from very negative to very positive.

Qian et al. [62] proposed two models based on compositional functions, namely the tag-guided recursive neural network (TG-RNN) and the tag-embedded recursive neural network/recursive neural tensor network (TE-RNN/RNTN). The former model selects a composition function based on the POS tags of a phrase, whereas the latter combines tag and word embeddings. They tested the performance on the Sentiment Treebank corpus, and the models achieved significant improvements over baseline models.

The dynamic CNN proposed by Kalchbrenner et al. [63] uses a dynamic k-max pooling operator to capture the semantics of sentences. They experimented on the DCNN by varying the initialization of the word embeddings: CNN with random initialization, CNN with pre-trained and fine-tuned embeddings, and CNN with multiple sets of word embeddings. The character-to-sentence CNN model proposed in [64] uses two layers of CNN for extracting word- and sentence-level features from input sentences of varying length for sentiment analysis. Wang et al. [65] utilized gates and constant error carousels in the memory structure of LSTM for handling the interaction among words via a compositional function. A regional CNN-LSTM model [66] performs dimensional sentiment analysis in which the regional CNN captures sentence-level information locally and the LSTM captures long-distance dependencies.
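The k-max pooling operator at the heart of the dynamic CNN can be sketched in PyTorch as follows; shapes are illustrative, and the dynamic, layer-dependent choice of k described in [63] is omitted here.

```python
# Sketch of k-max pooling over the sequence dimension: keep the k largest
# activations per feature map while preserving their original order.
import torch

def kmax_pooling(x: torch.Tensor, k: int) -> torch.Tensor:
    # x: (batch, channels, seq_len); returns (batch, channels, k)
    topk_indices = x.topk(k, dim=-1).indices
    return x.gather(-1, topk_indices.sort(dim=-1).values)

feature_maps = torch.randn(4, 100, 37)   # CNN output over a 37-word sentence
pooled = kmax_pooling(feature_maps, k=5)
print(pooled.shape)                       # torch.Size([4, 100, 5])
```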

Motivated by the structural correspondence learning method commonly used for domain adaptation [83], Yu and Jiang [41] proposed the idea of learning generalized sentence embeddings for cross-domain sentence-level sentiment analysis and designed CNN models for jointly learning hidden feature representations of labeled and unlabeled data.

Aspect-based sentiment analysis approaches

Ruder et al. [67] captured intra- and inter-sentence relations using a hierarchical bi-directional LSTM for aspect-based sentiment analysis. The complete reliance on the sentence and its structure makes their approach language-independent, thus supporting multi-lingual ABSA.

Wang et al. [68] proposed integrating recursive neural networks with a conditional random field for jointly extracting explicit aspect terms and opinion terms as the first step toward ABSA. Xu et al. [69] applied a double embedding mechanism with a CNN model for aspect extraction. This approach uses both general embeddings (GloVe-CNN) and domain-specific embeddings (DE-CNN) without any extra supervision.

The attention-over-attention mechanism proposed in [70] jointly models representations of aspects and sentences to capture the interaction between aspects and the context of the sentences. It uses two bi-directional LSTM networks for learning the hidden semantics of the words in the sentence and the target. The target-specific transformation network (TNet) [71] adapts a convolutional neural network for target-level sentiment classification and integrates target information into the word representations via a target-specific transformation component.

Wang et al. [72] proposed an attention-based LSTM for ABSA, with two ways of incorporating the aspect information into the attention mechanism. The interactive attention network (IAN) [73] leverages target and context information for computing attention vectors and learns separate target and context representations; by concatenating the target representation with the context representation, IAN predicts the polarity of the target. Zhang et al. [74] proposed gated recurrent neural networks for targeted sentiment analysis: a bi-directional gated neural network with a pooling layer over the hidden states (instead of the words) is used to better represent the target and its context, and a three-way gated neural network models the interaction between the surrounding context and the target. Saeidi et al. [84] proposed the SentiHood dataset for targeted ABSA and used a bi-directional LSTM model and a logistic regression model to learn a classifier for each aspect.

Ma et al. [77] proposed a solution for targeted ABSA by applying an attention mechanism in a two-step model at the target and sentence level and extending LSTM to incorporate commonsense knowledge associated with sentiments. Inspired by the use of memory-augmented models in machine reading, Liu et al. [78] proposed to use external memory chains with a delayed memory update mechanism, enabling the model to track multiple target entities for targeted ABSA. Sun et al. [79] utilized the pre-trained BERT language model for targeted ABSA. Specifically, they represented a single sentence and a pair of sentences using the pre-trained BERT language model and constructed auxiliary sentences, thereby transforming targeted ABSA into a sentence-pair classification task. By fine-tuning the pre-trained BERT model, sentiment analysis is then performed.
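The sentence-pair formulation of Sun et al. [79] can be sketched with the Hugging Face transformers library as follows; the auxiliary-sentence wording, model name, and label set are illustrative assumptions of this example, not the exact setup of [79].

```python
# Sketch of the sentence-pair construction described above: the review sentence is
# paired with an auxiliary sentence built from a (target, aspect) pair and fed to a
# BERT sequence classifier. All names and data are illustrative.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)   # e.g., positive / negative / none

review = "The ambience is good in Viceroy but the service is bad."
auxiliary = "What do you think of the service of Viceroy?"

inputs = tokenizer(review, auxiliary, return_tensors="pt")  # [CLS] review [SEP] aux [SEP]
logits = model(**inputs).logits
print(logits.argmax(dim=-1))  # predicted sentiment class for (Viceroy, service)
```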

5 Evaluation Metrics for Sentiment Analysis

Evaluation metrics commonly used for sentiment analysis are discussed in this section; a short numerical sketch of several of these metrics follows the list below.

  • Accuracy: Accuracy relates to how often the sentiment rating predicted by the model is correct; the higher the accuracy, the better the model. Accuracy is calculated as

    $$ {\text{Acc}} = \frac{{\text{TP}} + {\text{TN}}}{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}} $$
    (1)

    where TP, TN, FP, and FN denote true positive, true negative, false positive, and false negative, respectively.

  • F1 score: It uses both precision and recall of test data for finding its score. It is calculated as follows.

    $$ F_{1} = \frac{{2\left( {Precision \times Recall} \right)}}{{Precision + Recall}} $$
    (2)
  • Average recall (AvgRec): For the models, which find the overall sentiment of a document or text, average recall is used. Average recall is calculated by averaging the recall across the sentiment classes such as positive, negative, and neutral.

    $$ AvgRec = \frac{1}{3}\left( {R^{P} + R^{N} + R^{U} } \right) $$
    (3)

    where \( R^{P} \), \( R^{N} \), and \( R^{U} \) refer to the recall for the positive, negative, and neutral classes, respectively. The value of AvgRec lies in the range [0, 1]. Average recall is more robust to class imbalance than standard accuracy; the higher the value of AvgRec, the better the model.

  • Macro-average F1 score: Macro-average F1 score is calculated with respect to positive and negative classes as

    $$ F_{1}^{PN} = \frac{1}{2}\left( {F_{1}^{P} + F_{1}^{N} } \right) $$
    (4)

    where \( F_{1}^{P} \) and \( F_{1}^{N} \) denote \( F_{1} \) score with respect to positive and negative class, respectively.

  • Ranking loss: It averages the distance between actual and predicted rank [85, 86]. It is calculated as follows.

    $$ Ranking\,loss = \sum\limits_{i = {1}}^{n} {\frac{{\left| {t_{i} - \hat{t}_{i} } \right|}}{{k \times n}}} $$
    (5)

    where \( t_{i} \) and \( \hat{t}_{i} \) denote the actual and predicted sentiment values, respectively, k is the number of sentiment classes, and n is the number of test instances.

  • Macro-averaged mean absolute error: It is robust for imbalanced datasets [87] and is calculated as

    $$ MAE^{M} \left( {t,\hat{t}} \right) = \frac{{1}}{k}\sum\limits_{{j = {1}}}^{k} {\frac{{1}}{{\left| {t_{j} } \right|}}\sum\limits_{{t_{i} \in t_{j} }} {\left| {t_{i} - \hat{t}_{i} } \right|} } $$
    (6)

    where t and \( \hat{t} \) denote the vectors of actual and predicted sentiment values, respectively, \( t_{j} = \left\{ {t_{i} :t_{i} \in t,t_{i} = j} \right\} \), and k denotes the number of sentiment classes in t.

  • Least absolute error (LAE) [88]: It is a widely used evaluation measure for calculating the error of sentiment classification. It is given as

    $$ {\text{LAE}} = \sum\limits_{i = 1}^{n} {\left| {\hat{t}_{i} - t_{i} } \right|} $$
    (7)

    where \( \hat{t}_{i} \) and \( t_{i} \) denote the predicted and actual sentiment values, respectively.

  • Mean squared error (MSE) [89]: It is used for evaluating the sentiment prediction error. It is specifically used for regression. MSE and Root MSE are computed as follows.

    $$ {\text{MSE}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {\hat{t}_{i} - t_{i} } \right)^{2} } $$
    (8)
    $$ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {\hat{t}_{i} - t_{i} } \right)^{2} } } $$
    (9)

    where n denotes the number of test instances, and \( \hat{t}_{i} \) and \( t_{i} \) denote the predicted and actual sentiment values, respectively. Note that lower values of MSE and RMSE indicate better performance of the prediction model.

  • Pearson correlation coefficient: It is calculated as

    $$ r = \frac{1}{n - 1}\sum\limits_{i = 1}^{n} {\left( {\frac{{t_{i} - \bar{t}}}{{\sigma_{t} }}} \right)\left( {\frac{{\hat{t}_{i} - \bar{\hat{t}}}}{{\sigma_{{\hat{t}}} }}} \right)} $$
    (10)

    where n denotes the number of test instances, \( \hat{t}_{i} \) and \( t_{i} \) denote the predicted and actual sentiment values, \( \bar{\hat{t}} \) and \( \bar{t} \) denote the arithmetic means of the predicted and actual values, and σ represents the standard deviation. A higher value of r indicates better prediction accuracy of the model.

  • Discounted cumulative gain (DCG): While performing sentiment analysis using topic modeling techniques, topics (aspects) are detected first and then the sentiments associated with the detected topics (aspects) are predicted. Therefore, for evaluating the relevance of the returned topics (aspects), normalized discounted cumulative gain (nDCG) is used [90]. The regular DCG is computed as follows.

    $$ {\text{DCG}}_{m} = \sum\limits_{i = 1}^{m} {\frac{{2^{rel(i)} - 1}}{{\log_{2} (i + 1)}}} $$
    (11)

    where m represents the top m topics (aspects) and \( {\text{rel}}(i) \) denotes the relevance score of topic (aspect) i. For models which produce rankings of the detected topics (aspects), normalized DCG summarizes the quality of the rankings.

  • Kullback–Leibler divergence (KLD): KLD [91] is used for measuring the error in estimating an actual distribution t over a set \( {\mathbf{\mathcal{K}}} \) of sentiment classes by means of a predicted distribution \( \hat{t} \). Like \( MAE^{M} \), the lower the value of KLD, the better the model. KLD is calculated as follows.

    $$ {\text{KLD}}\left( {\hat{t},t,\mathcal{K}} \right) = \sum\limits_{{k_{j} \in \mathcal{K}}} {t\left( {k_{j} } \right)\log_{e} \frac{{t\left( {k_{j} } \right)}}{{\hat{t}\left( {k_{j} } \right)}}} $$
    (12)
  • Area under the ROC curve (AUC): Saeidi et al. [84] proposed to use the AUC metric for tasks of aspect and sentiment detection. AUC helps to measure the quality of ranking the output scores without relying on the threshold.
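The following NumPy sketch, referenced at the beginning of this section, computes AvgRec, macro-averaged MAE, and KLD on toy predictions; the data and class coding are illustrative only.

```python
# Numerical sketch of a few of the metrics listed above (AvgRec, macro-averaged
# MAE, and KL divergence) using plain NumPy; the toy predictions are illustrative.
import numpy as np

classes = [0, 1, 2]                       # negative, neutral, positive
y_true = np.array([0, 0, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 2, 1, 2])

# Average recall: mean of per-class recalls (robust to class imbalance).
recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
avg_rec = np.mean(recalls)

# Macro-averaged mean absolute error: per-class MAE, then averaged over classes.
mae_m = np.mean([np.mean(np.abs(y_pred[y_true == c] - c)) for c in classes])

# KL divergence between the true and predicted class distributions
# (assumes every class has nonzero probability in both distributions).
t = np.bincount(y_true, minlength=len(classes)) / len(y_true)
t_hat = np.bincount(y_pred, minlength=len(classes)) / len(y_pred)
kld = np.sum(t * np.log(t / t_hat))

print(round(avg_rec, 3), round(mae_m, 3), round(kld, 3))
```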

6 Benchmarked Datasets and Tools

Table 4 gives a glimpse of the standard benchmarked datasets used for sentiment analysis at the document, sentence, aspect, and targeted aspect level.

Table 4 Benchmarked datasets for sentiment analysis

There are numerous tools available which offer sentiment analysis as one of their services. The details of tools providing sentiment analysis as a service are given in Table 5.

Table 5 Comparative study of existing tools for sentiment analysis

Reflecting the popularity of sentiment analysis, dedicated search engines have been developed, such as Social Mention [116], Social Searcher [117], and Talkwalker's Quick Search [118]. Social Mention [116] combines user-generated data from across the Web and gives the sentiment of a given keyword based on how many positive, negative, and neutral mentions of the keyword are present in the collected data. Social Searcher [117] is a real-time search engine for quickly pulling recent mentions from popular social networks; it displays analytics in the form of mentions, users, and sentiments for the topic entered in the search box, and also offers sentiment filters to narrow the set of mentions.

7 Conclusion

This chapter gives a demystified overview of state-of-the-art approaches for sentiment analysis. The proposed graphical taxonomy gives the traits to be considered when designing sentiment analysis systems. Providing suitable input to deep learning models plays a crucial role in achieving good performance. Therefore, the parameters associated with text representation techniques, such as the use of embedded vectors, language models, ways of improving the effectiveness of embedded vectors, and approximating the computationally expensive softmax function in embedding models, have been thoroughly discussed.

A comparative overview of the noteworthy research papers focusing on sentiment analysis at document, sentence, and aspect level using deep learning approaches has been given in the chapter. We also shed light upon state-of-the-art benchmarked datasets and the tools and services available for sentiment analysis.