
1 Introduction

Aspect-level sentiment analysis is a fine-grained sentiment analysis task that aims to predict the sentiment polarity (i.e., positive, neutral, or negative) of a specific target in a given sentence. It can be applied in many fields, such as product review analysis, public opinion analysis, and stock opinion analysis.

A core challenge of aspect-level fine-grained sentiment analysis is to correctly identify the sentiment polarity of a given aspect in a sentence that contains more than one aspect with different polarities. For example, in the sentence "Service was slow, but the people were friendly.", "service" and "people" are two targets, each related to a different opinion word, "slow" and "friendly" respectively. Thus the polarity for the target "service" is negative, while the polarity for the target "people" is positive. Therefore, finding the relationship between the target and its corresponding opinion words is important for obtaining the final sentiment of a target in the sentence.

In previous studies, various solutions have been proposed to capture the context information of a given target. One solution is to use the position of opinion words in the sentence to obtain a more precise relationship between opinion words and the target. For example, Zeng et al. [1] introduced the position information of words to help capture the relationship between opinion words and the target. Another solution is to model the context information (not limited to the opinion words) related to the target in the sentence. For example, Tang et al. [2] used two LSTM networks to model the left and right context information related to the target, respectively. Wang et al. [3] combined each word's hidden state with the aspect embedding as supplementary context information for the target. These methods achieve good performance on aspect-level sentiment analysis based on the context information related to the target words in the sentence or on word location information.

However, we find that the context information captured by the above models covers all the words across the sentence (e.g., the left context and the right context). We argue that opinion words are more important in determining the polarity of the sentence for the given target; that is, the relationship between the target and the opinion words deserves to be modeled independently.

To this end, we propose a position-aware hybrid attention network which consists of two components, namely an opinion attention network and a context attention network. The context attention network is used to capture the context information between the words across the sentence and the target, and the opinion attention network is used to incorporate the independent relationship between opinion words and the target. The proposed model shows stable improvements on the laptop and restaurant datasets. The main contributions of our work are as follows:

(1) We propose a hybrid attention network to capture the context information between the words across the sentence and the target, as well as the independent relationship between the opinion words and the target, so as to obtain more precise sentiment information for the given target in the sentence.

(2) We conduct several experiments and ablation tests on the public laptop and restaurant datasets to validate our model. We show that our model achieves stable and effective performance compared with the baseline models.

2 Related Work

Aspect-level sentiment analysis aims to detect the polarity of a given target in a sentence. Many previous studies rely on rich features, such as sentiment lexicons, linguistic features, and syntax, to help detect the sentiment. Kiritchenko et al. [4] built two sentiment lexicons for the restaurant and laptop domains and achieved good results in detecting aspects and sentiment by using these lexicons. Wagner et al. [5] combined four sentiment lexicons to design rule-based features and extracted bag-of-n-gram features to train a classifier for aspect-level sentiment analysis. Vo et al. [6] split a tweet into a left context and a right context according to a given target, using distributed word representations and neural pooling functions to extract features.

In recent years, different neural network based models have been proposed and achieve good results on the aspect-level sentiment analysis task due to their strong capacity to automatically extract high-level features of sentences [7,8,9,10,11,12,13,14,15,16,17,18,19,20].

As the context information of a given target is useful for improving the performance of the aspect-level sentiment analysis task, some previous studies focused on modeling context information related to the target. For example, Tang et al. [2] used two LSTM networks to model the left and right context information related to the target words, respectively. The left and right target-dependent representations are concatenated as the final representation of the sentence to predict the sentiment polarity of the aspect. Wang et al. [3] combined the hidden states of each word with the aspect embedding as supplementary context information to supervise the generation of attention vectors, and used the attention vectors to generate the final representation of the sentence. Tang et al. [7] captured the correlation between each context word and the aspect through multiple attentions and used the output of the last attention as the final representation of the sentence. Different from the above models, Ma et al. [8] used two independent LSTM networks to model aspects and contexts respectively, and used the attentive representation of the aspect for the context, with the concatenation of the hidden states of the two LSTM networks as the final representation of the sentence. Chen et al. [9] proposed a multi-layer architecture in which each layer includes attention-based word feature aggregation and a GRU unit to learn the sentence representation.

Some recent studies paid more attention to word location information and achieved good results [1, 21, 22]. Zeng et al. [1] used a Gaussian kernel to model the position of words. By introducing the position information into the model, their method improved the results of the aspect-level sentiment analysis task. Wang et al. [21] introduced global attention scores and grammar-based local attention scores for the task, and a gating mechanism was used to synthesize the global and local information to generate the final representation of the sentence.

In this paper, we also focus on capturing the context information related to the given target of the sentence. We propose a hybrid attention network based model to incorporate the independent relationship between opinion words and the target, as well as the context information between the words across the sentence and the target.

3 The Proposed Model

In this paper, a position-aware hybrid attention network is proposed for aspect-level sentiment analysis. As shown in Fig. 1, our model mainly includes four parts: an embedding layer, an encoder layer, a hybrid attention layer, and an output layer, where the hybrid attention layer is divided into an opinion attention module and a context attention module.

Fig. 1. The whole architecture of our model.

For the context attention module, following previous work, we use the aspect representation to help calculate the attention of each word across the sentence with respect to the target. For the opinion attention module, we use the aspect representation to help calculate the attention scores of the candidate opinion words, and generate the opinion feature representation with different weights. We then feed the context representation obtained from the whole sentence and the opinion relationship representation obtained from only the independent opinion words into a fully connected layer to get the final representation of the sentence. In our model, similar to previous work, we also concatenate a position embedding with each word embedding to better capture the position information of the words relative to the target.

In the following sections, we will describe our model in more detail. Section 3.1 gives the problem definition, Sect. 3.2 introduces word position embedding, Sect. 3.3 introduces the encoding layer for sentence, target and opinion words, Sect. 3.4 introduces the hybrid attention networks and Sect. 3.5 describes the loss function of our model.

3.1 Problem Definition

Given a sentence of n words \( S = \left\{ {w_{1} ,w_{2} ,w_{3} , \ldots ,w_{n} } \right\} \) and a target of k words \( A = \left\{ {w_{1}^{a} ,w_{2}^{a} ,w_{3}^{a} , \ldots ,w_{k}^{a} } \right\} \), where \( A \) is a subset of \( S \), the purpose of aspect-level sentiment analysis is to find the sentiment of the given target A in the context sentence S.

As stated above, we argue that the opinion words are also important in determining the polarity of the sentence for the given target. In our model, we use the sentiment lexicon of Bing Liu [23] to extract the opinion words of the sentence. Given the sentence S, we can also obtain the opinion words \( O = \left\{ {w_{1}^{o} ,w_{2}^{o} ,w_{3}^{o} , \ldots ,w_{m}^{o} } \right\} \), where \( O \) is a subset of \( S \). The task is then defined as finding the sentiment of the given target A in a context sentence S with the extracted opinion words \( O \).
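To make the definition concrete, the example sentence from Sect. 1 yields the following toy instance (the variable names are only illustrative and not part of the model):

```python
# Toy instance of the task definition, using the example from Sect. 1
S = ["Service", "was", "slow", ",", "but", "the", "people", "were", "friendly", "."]
A = ["Service"]                    # the given target
O = ["slow", "friendly"]           # opinion words matched against the sentiment lexicon
# expected prediction for the target "Service": negative
```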

3.2 Word Position Embedding

Let \( E \in {\mathbb{R}}^{{d_{e} \times \left| V \right|}} \) be the pre-trained word embedding matrix generated by an unsupervised method [24, 25], and \( P \in {\mathbb{R}}^{{d_{p} \times \left| N \right|}} \) be the position embedding matrix, where \( d_{e} \) is the dimension of the word embedding, \( \left| V \right| \) is the vocabulary size, \( d_{p} \) is the dimension of the position embedding, and \( \left| N \right| \) is the number of possible relative positions between each word and the aspect.

We define the relative distance between each word and the target as the relative offset of the word in the sentence with respect to the target. We calculate the distance using formula (1), where i is the index of each word in the sentence, j is the index of the first word of the target, k is the length of the target, and n is the length of the whole sentence.

$$ \left\{ {\begin{array}{*{20}l} {i - j} \hfill & {\quad i < j} \hfill \\ {i - j - k} \hfill & {\quad j + k < i \le n} \hfill \\ {0} \hfill & {\quad j \le i \le j + k} \hfill \\ \end{array} } \right. $$
(1)
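For illustration, the offset computation in formula (1) can be written as a small Python function; the function name and argument conventions are ours rather than the paper's:

```python
def relative_distance(i, j, k):
    """Relative offset of the i-th word to a target starting at index j with length k (formula (1))."""
    if i < j:
        return i - j          # word lies to the left of the target
    elif i > j + k:
        return i - j - k      # word lies to the right of the target
    else:
        return 0              # word lies inside the target span
```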

We look up each word of the sentence S, the opinion words O, and the target A in the pre-trained word embedding matrix and map them into \( d_{e} \)-dimensional vectors. We also look up each word of the sentence S and the opinion words O in the position embedding matrix and map them into \( d_{p} \)-dimensional vectors. Finally, the word embedding and the position embedding are concatenated. For the target A, there is no position embedding, so no concatenation is needed:

$$ x_{i} = \left[ {E\left( {w_{i} } \right);P\left( {w_{i} } \right)} \right] $$
(2)
$$ x_{i}^{o} = \left[ {E\left( {w_{i}^{o} } \right);P\left( {w_{i}^{o} } \right)} \right] $$
(3)
$$ x_{i}^{a} = E\left( {w_{i}^{a} } \right) $$
(4)

where \( w_{i} ,w_{i}^{o} ,w_{i}^{a} \) denote words of the sentence S, the opinion words O, and the aspect A respectively, E(w) denotes a lookup in the word embedding matrix, P(w) denotes a lookup in the position embedding matrix, and [;] denotes vector concatenation.
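A minimal PyTorch-style sketch of Eqs. (2)–(4) might look as follows; the class name, default dimensions, and index handling are our assumptions rather than details given in the paper:

```python
import torch
import torch.nn as nn

class WordPositionEmbedding(nn.Module):
    """Sketch of Eqs. (2)-(4): word embedding, optionally concatenated with a position embedding."""
    def __init__(self, vocab_size, num_positions, d_e=300, d_p=100):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_e)     # E in the paper
        self.pos_emb = nn.Embedding(num_positions, d_p)   # P in the paper

    def forward(self, word_ids, pos_ids=None):
        # pos_ids are relative offsets from formula (1), shifted to non-negative indices beforehand
        x = self.word_emb(word_ids)
        if pos_ids is None:                               # target words carry no position embedding (Eq. (4))
            return x
        return torch.cat([x, self.pos_emb(pos_ids)], dim=-1)   # Eqs. (2)-(3)
```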

3.3 Sentence, Target and Opinion Words Encoding

In our model, we use three bidirectional long short-term memory (Bi-LSTM) networks [26] to encode the context information, the opinion information, and the aspect information, respectively. For the forward LSTM, we feed the word embedding \( x_{t} \) and the hidden state at the previous time step \( \overrightarrow{h}_{t - 1} \) into the LSTM cell, and the hidden state \( \overrightarrow{h}_{t} \) is calculated as:

$$ \overrightarrow{h}_{t} = \overrightarrow{LSTM} \left( {x_{t} ,\overrightarrow{h}_{t - 1} } \right) $$
(5)

The backward LSTM does the same thing as the forward LSTM except that the input sequence is fed in reverse. Then the hidden states of the forward LSTM and the backward LSTM are concatenated, and a hyperbolic tangent activation function is applied to the concatenation to form the hidden state \( h_{i} \):

$$ \overleftarrow{h}_{t} = \overleftarrow{LSTM} \left( {x_{t} ,\overleftarrow{h}_{t + 1} } \right) $$
(6)
$$ h_{i} = \tanh \left( {\left[ {\overrightarrow{h}_{i} ; \overleftarrow{h}_{i} } \right]} \right) $$
(7)

where \( \overrightarrow{h}_{i} \) and \( \overleftarrow{h}_{i} \) are the hidden states of the forward LSTM and the backward LSTM at time step i, respectively. The outputs of the encoder layer are denoted as \( H = \left\{ {h_{1} , h_{2} , h_{3} , \ldots , h_{n} } \right\} \), \( H_{o} = \left\{ {h_{1}^{o} , h_{2}^{o} , h_{3}^{o} , \ldots , h_{m}^{o} } \right\} \), and \( H_{a} = \left\{ {h_{1}^{a} , h_{2}^{a} , h_{3}^{a} , \ldots , h_{k}^{a} } \right\} \).
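As a rough sketch, the encoder of Eqs. (5)–(7) could be implemented with a single bidirectional LSTM layer; the class name and hidden size below are assumptions:

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    """Sketch of the Bi-LSTM encoder in Eqs. (5)-(7)."""
    def __init__(self, input_dim, d_l=100):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, d_l, batch_first=True, bidirectional=True)

    def forward(self, x):          # x: (batch, seq_len, input_dim)
        h, _ = self.lstm(x)        # h already concatenates forward and backward states
        return torch.tanh(h)       # Eq. (7): tanh over [h_forward; h_backward]
```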

3.4 Hybrid Attention Network

As shown in Fig. 1, we design two attention modules, Opinion Attention and Context Attention, to incorporate the independent relationship between opinion words and the target and the context information of the words across the sentence related to the target. Opinion Attention aims to generate a precise opinion representation by learning the relationship between opinion words and the target, and Context Attention makes the model focus on the words across the sentence that are related to the target.

Opinion Attention.

Opinion Attention is designed to capture the independent relationship between different opinion words and the target. As shown in Fig. 2, our opinion word extraction strategy is as follows: we first combine the positive and negative sentiment lexicons into a single sentiment lexicon. Based on the combined lexicon, given a sentence S, we obtain the candidate opinion words O. In order to determine the corresponding opinion words of each aspect, we also use dependency syntax analysis. Words that are directly connected to the aspect in the dependency tree are said to be directly reachable, and the distance between these words and the aspect is 1. We test candidate opinion word extraction strategies with different dependency distances, and the results are discussed later; a minimal sketch of the extraction step is shown after Fig. 2.

Fig. 2. Our opinion words extraction strategy based on the dependency tree.
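The following is a minimal sketch of this extraction strategy. The paper parses sentences with Stanford CoreNLP; here we simply assume the dependency parse is already available as a list of (head, dependent) index pairs, and all names are illustrative:

```python
from collections import deque

def extract_opinion_words(tokens, lexicon, dep_edges, aspect_idx, max_dist=1):
    """Keep lexicon words within `max_dist` hops of the aspect in the dependency graph."""
    # build an undirected adjacency list over token indices
    adj = {i: [] for i in range(len(tokens))}
    for head, dep in dep_edges:
        adj[head].append(dep)
        adj[dep].append(head)
    # breadth-first search outward from the aspect tokens
    dist = {i: 0 for i in aspect_idx}
    queue = deque(aspect_idx)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return [tokens[i] for i in sorted(dist)
            if 0 < dist[i] <= max_dist and tokens[i].lower() in lexicon]
```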

Given the target representation \( H_{a} = \left\{ {h_{1}^{a} ,h_{2}^{a} , \ldots ,h_{k}^{a} } \right\} \) and the hidden states of the opinion words \( H_{o} = \left\{ {h_{1}^{o} ,h_{2}^{o} , \ldots ,h_{m}^{o} } \right\} \), the opinion attention scores α can be calculated by formulas (8)–(10). First, we obtain the average pooling of the target representation \( h_{a\_avg} \). We then use the aspect representation to learn the attention score \( \alpha_{i} \) of each candidate opinion word with respect to the target, where \( W_{att1} \in {\mathbb{R}}^{{2d_{l} \times 2d_{l} }} \) is a weight matrix.

$$ h_{a\_avg} = \frac{1}{k}\mathop \sum \limits_{i = 1}^{k} h_{i}^{a} $$
(8)
$$ f_{o} \left( {h_{i}^{o} ,h_{a\_avg} } \right) = h_{i}^{o} W_{att1} h_{a\_avg}^{{\rm T}} $$
(9)
$$ \alpha_{i} = \frac{{\exp \left( {f_{o} \left( {h_{i}^{o} ,h_{a\_avg} } \right)} \right)}}{{\mathop \sum \nolimits_{j = 1}^{m} \exp \left( {f_{o} \left( {h_{j}^{o} ,h_{a\_avg} } \right)} \right)}} $$
(10)

Then the relationship representation \( r_{o} \in {\mathbb{R}}^{{2d_{l} }} \) is computed as the sum of the hidden states \( h_{i}^{o} \) weighted by their attention scores \( \alpha_{i} \), as shown in formula (11):

$$ r_{o} = \mathop \sum \limits_{i = 1}^{m} h_{i}^{o} \alpha_{i} $$
(11)
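Both the opinion attention (Eqs. (8)–(11)) and the context attention described next (Eqs. (12)–(14)) share the same bilinear form with separate weights. A hedged PyTorch sketch of one such module, under our own naming and dimension assumptions, is:

```python
import torch
import torch.nn as nn

class BilinearAttention(nn.Module):
    """Sketch of the bilinear attention used in Eqs. (8)-(11) and (12)-(14)."""
    def __init__(self, d_l=100):
        super().__init__()
        self.W = nn.Parameter(torch.empty(2 * d_l, 2 * d_l))   # W_att1 or W_att2
        nn.init.xavier_uniform_(self.W)

    def forward(self, H, H_a):
        # H: (batch, m, 2*d_l) opinion or context states; H_a: (batch, k, 2*d_l) aspect states
        h_a_avg = H_a.mean(dim=1)                               # Eq. (8): average pooling of the target
        scores = H @ self.W @ h_a_avg.unsqueeze(-1)             # Eq. (9)/(12): bilinear score
        alpha = torch.softmax(scores, dim=1)                    # Eq. (10)/(13): normalize over words
        r = (alpha * H).sum(dim=1)                              # Eq. (11)/(14): weighted sum
        return r, alpha.squeeze(-1)
```

In the full model, one instance of such a module would produce \( r_{o} \) from the opinion-word states and a second instance would produce \( r_{c} \) from the sentence states.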

Context Attention.

Given the target representation and the hidden states of the words across the sentence \( H = \left\{ {h_{1} ,h_{2} , \ldots ,h_{n} } \right\} \), the context attention scores β can be calculated by formulas (12)–(13), where \( W_{att2} \in {\mathbb{R}}^{{2d_{l} \times 2d_{l} }} \) is a weight matrix.

$$ f_{c} \left( {h_{i} ,h_{a\_avg} } \right) = h_{i} W_{att2} h_{a\_avg}^{{\rm T}} $$
(12)
$$ \beta_{i} = \frac{{\exp \left( {f_{c} \left( {h_{i} ,h_{a\_avg} } \right)} \right)}}{{\mathop \sum \nolimits_{j = 1}^{n} \exp \left( {f_{c} \left( {h_{j} ,h_{a\_avg} } \right)} \right)}} $$
(13)

Then the context representation \( r_{c} \in {\mathbb{R}}^{{2d_{l} }} \) is computed as the sum of the hidden states \( h_{i} \) weighted by their attention scores \( \beta_{i} \), as shown in formula (14).

$$ r_{c} = \mathop \sum \limits_{i = 1}^{n} h_{i} \beta_{i} $$
(14)

From the above attention modules, we obtain the relationship representation \( r_{o} \) and the context representation \( r_{c} \). We then use a non-linear layer to project them into the C-class target space and obtain the aspect-specific representation r, as shown in formula (15).

$$ r = \tanh \left( {W_{o} r_{o} + W_{c} r_{c} } \right) $$
(15)

where \( W_{o} \in {\mathbb{R}}^{{2d_{l} \times C}} \) and \( W_{c} \in {\mathbb{R}}^{{2d_{l} \times C}} \) are weight matrices and C is the number of sentiment polarities. Then we use softmax to compute the sentiment distribution over r as in formula (16).

$$ y_{i} = \frac{{\exp \left( {r_{i} } \right)}}{{\mathop \sum \nolimits_{j = 1}^{C} \exp \left( {r_{j} } \right)}} $$
(16)
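A small sketch of this output layer (Eqs. (15)–(16)) under the same assumed conventions:

```python
import torch
import torch.nn as nn

class OutputLayer(nn.Module):
    """Sketch of Eqs. (15)-(16): fuse the two attention representations and predict polarity."""
    def __init__(self, d_l=100, num_classes=3):
        super().__init__()
        self.W_o = nn.Linear(2 * d_l, num_classes, bias=False)
        self.W_c = nn.Linear(2 * d_l, num_classes, bias=False)

    def forward(self, r_o, r_c):
        r = torch.tanh(self.W_o(r_o) + self.W_c(r_c))   # Eq. (15)
        return torch.softmax(r, dim=-1)                 # Eq. (16)
```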

3.5 Loss Function

Let \( \hat{y} \) be the estimated probability distribution and y be the true distribution. We use cross entropy with L2 regularization of the parameters as the loss function, as shown in formula (17), where i is the index of the sentence, j is the index of the class, N is the number of training samples, \( C \) is the number of sentiment classes, \( \lambda \) is the L2-regularization coefficient, and \( \Theta \) is the parameter set.

$$ J = - \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{j = 1}^{C} y_{i}^{j} \log \left( {\widehat{{y_{i}^{j} }}} \right) + \lambda \left( {\mathop \sum \limits_{{\theta \in\Theta }} \theta^{2} } \right) $$
(17)
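Assuming one-hot true labels and a list of model parameters, formula (17) can be sketched as:

```python
import torch

def loss_fn(y_pred, y_true, params, lam=0.001):
    """Sketch of Eq. (17): averaged cross entropy plus an L2 penalty on the parameters."""
    ce = -(y_true * torch.log(y_pred + 1e-12)).sum(dim=1).mean()   # cross entropy over N samples
    l2 = sum((p ** 2).sum() for p in params)                       # L2 term over the parameter set
    return ce + lam * l2
```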

4 Experiments

4.1 Dataset

We conducted several experiments on the SemEval 2014 task 4 dataset to verify the effectiveness of our model. The SemEval 2014 dataset includes reviews in two domains, laptops and restaurants. These reviews have three sentiment polarities: positive, neutral, and negative, as shown in Table 1. In addition, following previous work, we use accuracy as the evaluation metric of the model.

Table 1. The details of the laptop and restaurant datasets.

4.2 Experiment Settings

In our experiments, word embeddings are initialized with pre-trained 300-dimensional GloVe word vectors [24]. All out-of-vocabulary words are initialized by sampling from the uniform distribution over (−0.1, 0.1). The position embeddings of sentences and opinion words are initialized with the Xavier uniform distribution, and their dimension is set to 100. The weight matrices and biases are also initialized with the Xavier uniform distribution. To perform dependency syntax analysis, the sentences of both datasets are parsed with Stanford CoreNLP.

For model training, we set the dimension of the LSTM hidden state to 100, the dropout rate to 0.5, and the L2 regularization weight to 0.001. We use the Adam optimizer and set the batch size and the learning rate to 64 and 0.001, respectively.
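For reference, the reported hyperparameters can be collected into a single configuration (values taken from this section; the key names are ours):

```python
# Hyperparameters reported in Sect. 4.2
CONFIG = {
    "word_emb_dim": 300,        # pre-trained GloVe vectors
    "pos_emb_dim": 100,         # Xavier-uniform initialized position embeddings
    "lstm_hidden_dim": 100,
    "dropout": 0.5,
    "l2_weight": 0.001,
    "optimizer": "Adam",
    "batch_size": 64,
    "learning_rate": 0.001,
}
```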

4.3 Baseline Models

We compare our model with several baseline models, which are as follows:

Majority assigns the most frequent sentiment polarity in the training set to each sample in the test set. TD-LSTM [2] uses two LSTM networks to model the left and right contexts together with the target, which are concatenated as the final representation to predict the sentiment polarity of the aspect. AE-LSTM [3] uses an LSTM network to model the context words, and combines the word hidden states with the aspect embedding to supervise the generation of attention vectors. ATAE-LSTM [3] is an improvement of AE-LSTM; it further strengthens the effect of the aspect embedding by appending the aspect embedding to each word embedding vector to represent the context. PosATT-LSTM [1] introduces position information to model the word's position, and then combines the hidden states with the aspect and position information to supervise the generation of attention vectors.

MemNet [7] captures the correlation between each context word and the given aspect through multiple attentions, and uses the output of the last attention. IAN [8] uses two independent LSTM networks to model aspects and contexts respectively, and uses the average pooling of the hidden states of the context for the aspect attention score calculation. RAM [9] is a multi-layer architecture, where each layer includes attention-based word feature aggregation and a GRU unit to learn the sentence representation. SHAN [21] synthesizes global information and local information with a gating mechanism by introducing a global attention score and a grammar-based local attention score, respectively.

4.4 Experimental Results and Analysis

We test our model on the laptop and restaurant datasets; the experimental results are shown in Table 2. As shown in Table 2, Majority has the worst performance among all models. The LSTM-based models are better than Majority, which shows that LSTM networks can effectively generate sentence feature representations to predict the sentiment polarity of aspects.

Table 2. Experimental results of different models on the laptop and restaurant datasets.

We can also see that using the word position information related to the target plays an important role in generating the final representation. Both PosATT-LSTM and SHAN consider the positions of the words, and the experimental results of the two models are remarkable. Comparing ATAE-LSTM and PosATT-LSTM, we can see that PosATT-LSTM gains 4.1% and 2.2% in performance on the laptop and restaurant datasets, respectively, by using location information. SHAN does not directly use the relative distance between each word and the aspect, but considers a syntax-based distance, which eliminates a lot of noise to a certain extent and also achieves good results.

Our model combines the relative distance and the syntactic distance to further improve the performance. Compared with the above baseline models, our model achieves the best performance. On the laptop and restaurant datasets, our model achieves 75.71% and 81.43% accuracy, respectively, which demonstrates the effectiveness of our model.

4.5 Ablation Studies

To verify the contribution of the different components of our proposed model, we also carried out an ablation test. Pos-LSTM denotes our model with only the sentence encoding with position embedding, without the other components. Pos-Context-ATT denotes our model with the context attention component but without the opinion attention component. The ablation results are reported in Table 3.

Table 3. Experimental results of our model in ablation analysis.

As shown in Table 3, Pos-Context-ATT performs better than Pos-LSTM, with an increase of 2.35% and 1.78% on the laptop and restaurant datasets. This indicates that capturing the context information of the words across the sentence related to the target can indeed improve the performance of this task. In addition, compared with Pos-Context-ATT, our final model gains 1.26% and 1.61% on the laptop and restaurant datasets, which means that the relationship between opinion words and the target effectively supervises the final representation and improves the prediction results.

4.6 Discussion

To verify the impact of the dependency distance on our model, we conducted several experiments with dependency distances of 1, 2, and 3. The results are shown in Table 4.

Table 4. The impact analysis of dependency distance to our model.

It can be observed that the greater the dependency distance, the worse the performance of our model. Compared with a dependency distance of 1, when the dependency distance is 2, the accuracy decreases by 0.95% and 0.81%, and when the dependency distance is 3, the accuracy decreases by 2.2% and 1.79%. We believe that when the dependency distance is too large, the model selects opinion words that are not related to the aspect, and these opinion words introduce a lot of noise, which decreases the performance.

5 Conclusions

Based on the observation that the independent relationship between opinion words and the target conveys important sentiment information about the given target, a position-aware hybrid attention network for aspect-level sentiment analysis is proposed in this paper. Our model not only captures the context information of the words related to the target across the sentence, but also captures the relationship between opinion words and the target. The experimental results on the public datasets show that our model is more effective than the compared baseline models.

Although the hybrid attention proposed in our model achieves good performance, we find that the information of the opinion attention is not well used in the context attention. In future research, we will focus on the interaction between the opinion words and the context. We hope that opinion words can help supervise the generation of attention scores over the context, which can make the model focus on context words related to opinion words.