
1 Introduction

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to detect the sentiment polarity towards a given aspect in a sentence  [14, 17, 20]. The given aspect usually refers to an aspect term or an aspect category. An aspect term is a word or phrase explicitly mentioned in the sentence that represents a feature or entity of products or services. Aspect categories are pre-defined coarse-grained aspect descriptions, such as food, service, and staff in the restaurant review domain. ABSA therefore contains two subtasks, namely Aspect Term Sentiment Analysis (ATSA) and Aspect Category Sentiment Analysis (ACSA). Figure 1 shows an example of ATSA and ACSA. Given the sentence “The salmon is tasty while the waiter is very rude”, the sentiments toward the two aspect terms “salmon” and “waiter” are respectively positive and negative. ACSA detects the sentiment polarity towards a given pre-defined aspect category, which may be expressed explicitly or implicitly in the sentence. There are two aspect categories in the sentence of Fig. 1, i.e., food and waiter, and their sentiments are respectively positive and negative. Note that the annotations for ATSA and ACSA can be separated.

Fig. 1. An example of the ATSA and ACSA subtasks. The terms in red are two given aspect terms. Note that the annotations for ATSA and ACSA can be separated. (Color figure online)

To study ABSA, several public datasets have been constructed, including multiple SemEval Challenge datasets  [18,19,20] and the Twitter dataset  [5]. However, in these datasets, most sentences contain only one aspect or multiple aspects with the same sentiment polarity, which makes ABSA degenerate into sentence-level sentiment analysis  [9]. For example, only 0.09% of instances in the Twitter dataset belong to the case of multiple aspects with different sentiment polarities. To promote research on ABSA, NLPCC 2020 Shared Task 2 releases a Multi-Aspect Multi-Sentiment (MAMS) dataset, in which each sentence contains at least two aspects with different sentiment polarities. Obviously, the property of multi-aspect multi-sentiment makes this dataset more challenging than existing ABSA datasets.

To deal with ABSA, recent works employ neural networks and achieve promising results on previous datasets, such as attention networks  [6, 16, 25], memory networks  [2, 22], and BERT  [9]. These works separate the multiple aspects of a sentence into several instances and process one aspect at a time. As a result, they only consider local sentiment information for the given aspect while neglecting the sentiments of the other aspects in the same sentence as well as the relations between multiple aspects. This setting is unsuitable, especially for the new MAMS dataset, because multiple aspects of a sentence usually have different sentiment polarities in MAMS, and knowing the sentiment of one aspect can help infer the sentiments of the others. To address this issue, we re-formalize ABSA as a task of multi-aspect sentiment analysis and propose a Transformer-based Multi-aspect Modeling method (TMM) to simultaneously detect the sentiment polarities of all aspects in a sentence. Specifically, we adopt the pre-trained RoBERTa  [15] as the backbone network, build a multi-aspect scheme for MAMS based on the Transformer  [23] architecture, and then employ multi-head attention to learn the sentiments of and relations between multiple aspects. Compared with existing works, our method has three advantages:

  1. It can capture the sentiments of all aspects in a sentence synchronously, as well as the relations between them, thereby avoiding mistakenly focusing on sentiment information that belongs to other aspects.

  2. Modeling multiple aspects simultaneously improves computation efficiency considerably without requiring additional computing resources.

  3. Our method applies the strategy of transfer learning, which exploits large-scale pre-trained semantic and syntactic knowledge to benefit the downstream MAMS task.

Finally, our proposed method obtains clear improvements for both ATSA and ACSA on the MAMS dataset, and ranks second place in the NLPCC 2020 Shared Task 2 evaluation.

2 Proposed Method

In this section, we first re-formalize the ABSA task, then present our proposed Transformer-based Multi-aspect Modeling scheme for ATSA and ACSA. The final part introduces the fine-tuning and training objective.

2.1 Task Formalization

Prior studies separate multiple aspects and formalize ABSA as a problem of sentiment classification toward one given aspect a in the sentence \(s=\{w_1, w_2, \cdots , w_n\}\). In ATSA, the aspect term a is a span of the sentence s representing a feature or entity of products or services. For ACSA, the aspect category \(a\in A\), where A is the pre-defined aspect set, i.e., {food, service, staff, price, ambience, menu, place, miscellaneous} for the new MAMS dataset. The goal of ABSA is to assign a sentiment label \(y\in C\) to the aspect a of the sentence s, where C is the set of sentiment polarities (i.e., positive, neutral, and negative).

In this work, we re-formalize ABSA as a task of multi-aspect sentiment classification. Given a sentence \(s=\{w_1, w_2, \cdots , w_n\}\) and m aspects \(\{a_1, a_2, \cdots , a_m\}\) mentioned in s, the objective of MAMS is to simultaneously detect the sentiment polarities \(\{y_1, y_2, \cdots , y_m\}\) of all aspects \(\{a_1, a_2, \cdots , a_m\}\), where \(y_i\) corresponds to the sentiment label of the aspect \(a_i\).
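To make the re-formalized setting concrete, the following is a minimal sketch of how a multi-aspect instance could be represented in code; the class and field names are illustrative assumptions, not the official MAMS data format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MultiAspectInstance:
    """One MAMS instance: a sentence together with all of its aspects."""
    sentence: str        # the raw review sentence s
    aspects: List[str]   # aspect terms (ATSA) or aspect categories (ACSA)
    labels: List[str]    # one sentiment polarity per aspect, aligned by index

example = MultiAspectInstance(
    sentence="The salmon is tasty while the waiter is very rude",
    aspects=["salmon", "waiter"],
    labels=["positive", "negative"],
)
assert len(example.aspects) == len(example.labels)
```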

Fig. 2. Transformer-based Multi-Aspect Modeling for ATSA. In the above example, the aspect \(a_i\) may contain multiple words, and each word of the sentence might be split into several subwords. For simplicity, we do not show subword tokens here.

2.2 Transformer-Based Multi-aspect Modeling for ATSA

Recently, Bidirectional Encoder Representations from Transformers (BERT)  [4] has achieved great success by pre-training a language representation model on large-scale corpora and then fine-tuning it on downstream tasks. When fine-tuning on classification tasks, BERT uses the special token [CLS] to obtain a task-specific representation, then applies one additional output layer for classification. For ABSA, previous work concatenates the given single aspect and the original sentence as the input of the BERT encoder, then leverages the representation of [CLS] for sentiment classification  [9].

Inspired by BERT, we design a novel Transformer-based Multi-Aspect Modeling scheme (TMM) to address the MAMS task by simultaneously detecting the sentiments of all aspects in a sentence. Here we take the ATSA subtask as an example to elaborate on it. Specifically, given a sentence \(\{w_1, \cdots , a_1, \cdots , a_m, \cdots , w_n\}\), where the aspect terms are denoted in the original sentence for ease of description, we introduce two special tokens [AS] and [AE] to respectively mark the start position and end position of each aspect in the sentence. With these two tokens, the original sentence \(\{w_1, \cdots , a_1, \cdots , a_m, \cdots , w_n\}\) is transformed into the sequence \(\{w_1, \cdots , {{\mathtt {[AS]}}}, a_1, {{\mathtt {[AE]}}}, \cdots , {{\mathtt {[AS]}}}, a_m, {{\mathtt {[AE]}}}, \cdots , w_n\}\). Based on this new input sequence, we then employ a multi-layer transformer to automatically learn the sentiments of and relations between multiple aspects.
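The sketch below illustrates this input transformation for ATSA, assuming the aspect terms are given as word-level spans; subword tokenization and the registration of [AS]/[AE] as special tokens in the RoBERTa vocabulary are omitted for brevity.

```python
def mark_aspect_terms(tokens, aspect_spans):
    """Insert [AS]/[AE] markers around each aspect term.

    tokens: list of words in the sentence.
    aspect_spans: list of (start, end) word indices (end exclusive),
                  assumed non-overlapping and sorted.
    """
    starts = {s for s, _ in aspect_spans}
    ends = {e for _, e in aspect_spans}
    marked = []
    for i, tok in enumerate(tokens):
        if i in starts:
            marked.append("[AS]")
        marked.append(tok)
        if i + 1 in ends:
            marked.append("[AE]")
    return marked

tokens = "The salmon is tasty while the waiter is very rude".split()
print(mark_aspect_terms(tokens, [(1, 2), (6, 7)]))  # spans of "salmon", "waiter"
# ['The', '[AS]', 'salmon', '[AE]', 'is', 'tasty', 'while', 'the',
#  '[AS]', 'waiter', '[AE]', 'is', 'very', 'rude']
```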

As shown in Fig. 2, we finally fetch the representation \(\mathbf {H}_\mathrm{{[AS]}}\) of the start token [AS] of each aspect as the feature vector to classify the sentiment of that aspect.

Fig. 3. Transformer-based Multi-Aspect Modeling for ACSA.

2.3 Transformer-Based Multi-aspect Modeling for ACSA

Since aspect categories are pre-defined and may not be mentioned explicitly in the sentence, the above TMM scheme needs some modifications for ACSA. Given the sentence \(s=\{w_1, w_2, \cdots , w_n\}\) and the aspect categories \(\{a_1, a_2, \cdots , a_m\}\) in s, we concatenate the sentence and the aspect categories, using only the token [AS] to separate the aspects because each aspect category is a single word, finally forming the input sequence \(\{w_1, w_2, \cdots , w_n, {\mathtt {[AS]}}, a_1, {\mathtt {[AS]}}, a_2, \cdots , {\mathtt {[AS]}}, a_m\}\). As Fig. 3 shows, after the multi-layer transformer, we use the representation \(\mathbf {H}_\mathrm{{[AS]}}\) of the indication token [AS] of each aspect category for sentiment classification.
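A minimal sketch of this ACSA input construction follows, under the same assumptions as before (word-level tokens, special-token registration omitted); the category pairing in the usage example is illustrative.

```python
def build_acsa_input(sentence_tokens, aspect_categories):
    """Append each aspect category to the sentence, preceded by a single [AS] marker."""
    sequence = list(sentence_tokens)
    for category in aspect_categories:
        sequence.extend(["[AS]", category])
    return sequence

tokens = "The salmon is tasty while the waiter is very rude".split()
print(build_acsa_input(tokens, ["food", "staff"]))
# [..., 'very', 'rude', '[AS]', 'food', '[AS]', 'staff']
```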

2.4 Fine-Tuning and Training Objective

As mentioned above, we adopt the pre-trained RoBERTa as the backbone network, then fine-tune it on the MAMS dataset with the proposed TMM scheme. RoBERTa is a robustly optimized BERT approach, pre-trained with larger corpora and larger batch sizes.

In the fine-tuning stage, we employ a softmax classifier to map the representation \(\mathbf {H}^i_\mathrm{{[AS]}}\) of aspect \(a_i\) into the sentiment distribution \(\hat{\mathbf {y}}_i\) as follows:

$$\begin{aligned} \hat{\mathbf {y}}_i=\mathrm {softmax}(\mathbf {W}_o\mathbf {H}^i_\mathrm{{[AS]}}+\mathbf {b}_o), \end{aligned}$$
(1)

where \(\mathbf {W}_o\) and \(\mathbf {b}_o\) respectively denote weight matrix and bias.

Finally, we use the cross-entropy between the predicted sentiment distribution and the gold sentiment label as the training loss, which is defined as follows:

$$\begin{aligned} Loss=- \sum _{s\in D}\sum _{i=1}^{m}\sum _{j\in C}\mathbb {I}(y_i=j) \log \hat{y}_{i,j}, \end{aligned}$$
(2)

where s and D respectively denote a sentence and training dataset, m represents the number of aspects in the sentence s, C is the sentiment label set, \(y_i\) denotes the ground truth sentiment of aspect \(a_i\) in s, and \(\hat{y}_{i,j}\) is the predicted probability of the j-th sentiment towards the aspect \(a_i\) in the input sentence.
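To make Eqs. (1) and (2) concrete, below is a minimal PyTorch-style sketch of the TMM classification head and training loss, assuming the positions of the [AS] tokens are recorded when building each input; loading RoBERTa via the Hugging Face transformers library and padding unused aspect slots with label -100 are implementation assumptions, not details prescribed here.

```python
import torch
import torch.nn as nn
from transformers import RobertaModel

class RobertaTMM(nn.Module):
    """Sketch: classify each aspect from the hidden state of its [AS] token."""

    def __init__(self, model_name="roberta-large", num_labels=3):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden, num_labels)  # W_o and b_o in Eq. (1)

    def forward(self, input_ids, attention_mask, as_positions):
        # as_positions: (batch, max_aspects) indices of the [AS] tokens per sentence
        h = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        idx = as_positions.unsqueeze(-1).expand(-1, -1, h.size(-1))
        h_as = torch.gather(h, 1, idx)      # (batch, max_aspects, hidden)
        return self.classifier(h_as)        # one logit vector per aspect

# Eq. (2): cross-entropy summed over all aspects of all sentences;
# padded aspect slots carry the label -100 and are ignored.
criterion = nn.CrossEntropyLoss(ignore_index=-100, reduction="sum")
# logits: (batch, max_aspects, num_labels), labels: (batch, max_aspects)
# loss = criterion(logits.view(-1, logits.size(-1)), labels.view(-1))
```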

3 Experiment

3.1 Dataset and Metrics

Similar to the SemEval 2014 Restaurant Review dataset  [20], the original sentences in NLPCC 2020 Shared Task 2 come from the Citysearch New York dataset  [7]. Each sentence is annotated by three experienced researchers working on natural language processing. In the released MAMS dataset, the annotations for ATSA and ACSA are separated. For ACSA, eight coarse-grained aspect categories are pre-defined, i.e., food, service, staff, price, ambience, menu, place, and miscellaneous. Sentences containing only one aspect or multiple aspects with the same sentiment polarity are removed, so each sentence contains at least two aspects with different sentiments. This property makes the MAMS dataset more challenging. The statistics of the MAMS dataset are shown in Table 1.

Table 1. Statistics of the MAMS dataset. Sen. and Asp. respectively denote the numbers of sentences and given aspects in the dataset. Ave. represents the average number of aspects per sentence. Pos., Neu., and Neg. respectively indicate the numbers of positive, neutral, and negative sentiment labels.

NLPCC 2020 Shared Task 2 uses Macro-F1 to evaluate the performance of different systems, which is calculated as follows:

$$\begin{aligned} \mathrm {Precision}\;(P)&= TP/(TP+FP),\end{aligned}$$
(3)
$$\begin{aligned} \mathrm {Recall}\;(R)&= TP/(TP+FN),\end{aligned}$$
(4)
$$\begin{aligned} F1&= 2*P*R/(P+R), \end{aligned}$$
(5)

where TP, FP, TN, and FN respectively represent true positives, false positives, true negatives, and false negatives. The Macro-F1 value is the average of the F1 values over all categories. The final evaluation result is the average of the Macro-F1 values on the two subtasks (i.e., ATSA and ACSA). In this work, we also use standard accuracy as a metric to evaluate different methods.
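For reference, the per-class F1 and Macro-F1 described above can be computed with scikit-learn as in the short sketch below; the integer label encoding is an assumption for illustration.

```python
from sklearn.metrics import accuracy_score, f1_score

# Flat per-aspect predictions; 0/1/2 stand for positive/neutral/negative here.
y_true = [0, 1, 2, 0, 2]
y_pred = [0, 1, 2, 2, 2]

macro_f1 = f1_score(y_true, y_pred, average="macro")  # F1 per class, then averaged
accuracy = accuracy_score(y_true, y_pred)
print(f"Macro-F1: {macro_f1:.4f}, Accuracy: {accuracy:.4f}")
```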

3.2 Experiment Settings

We use the pre-trained RoBERTa as the backbone network, then fine-tune it on the downstream ATSA or ACSA subtask with our proposed Transformer-based Multi-aspect Modeling scheme. The RoBERTa model has 24 layers of transformer blocks, each with 16 self-attention heads, and a hidden size of 1024. When fine-tuning on ATSA or ACSA, we apply the Adam optimizer  [10] to update model parameters. The initial learning rate is set to 1e-5, and the mini-batch size is 32. We use the official validation set for hyperparameter tuning. Finally, we run each model 3 times and report the average results on the test set.
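A hypothetical setup mirroring these hyperparameters is sketched below; it reuses the RobertaTMM sketch from Sect. 2.4, and the use of the Hugging Face tokenizer with [AS]/[AE] registered as additional special tokens is an implementation assumption.

```python
import torch
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
tokenizer.add_special_tokens({"additional_special_tokens": ["[AS]", "[AE]"]})

model = RobertaTMM(model_name="roberta-large", num_labels=3)  # sketch from Sect. 2.4
model.encoder.resize_token_embeddings(len(tokenizer))         # account for [AS]/[AE]

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)     # reported settings
batch_size = 32
```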

3.3 Compared Methods

To evaluate the performance of different methods, we compare our RoBERTa-TMM method with the following baselines on ATSA and ACSA.

  • LSTM: We use a vanilla LSTM to encode the sentence and apply the average of all hidden states for sentiment classification.

  • TD-LSTM: TD-LSTM  [21] employs two LSTM networks respectively to encode the left context and right context of the aspect term, then concatenates them for sentiment classification.

  • AT-LSTM: AT-LSTM  [25] uses the aspect representation as query, and employs the attention mechanism to capture aspect-specific sentiment information. For ATSA, the aspect term representation is the average of word vectors in the aspect term. For ACSA, the aspect category representation is randomly initialized and learned in the training stage.

  • ATAE-LSTM: ATAE-LSTM  [25] is an extension of AT-LSTM. It concatenates the aspect representation and word embedding as the input of LSTM.

  • BiLSTM-Att: BiLSTM-Att is our implementation of a model similar to AT-LSTM, which uses a bidirectional LSTM to encode the sentence and applies aspect attention to capture the aspect-dependent sentiment.

  • IAN: IAN  [16] applies two LSTM networks to respectively encode the sentence and the aspect term, then proposes interactive attention to learn representations of the sentence and aspect term interactively. Finally, the two representations are concatenated for sentiment prediction.

  • RAM: RAM  [2] employs BiLSTM to build memory and then applies GRU-based multi-hops attention to generate the aspect-dependent sentence representation for predicting the sentiment of the given aspect.

  • MGAN: MGAN  [6] proposes fine-grained attention mechanism to capture the word-level interaction between aspect term and context, then combines it with coarse-grained attention for ATSA.

In addition, we also compare with strong transformer-based models, including \(\text {BERT}_\text {BASE}\) and RoBERTa. They adopt the conventional ABSA scheme and deal with one aspect at a time.

  • \(\mathbf{BERT} _\mathbf{BASE} \): \(\text {BERT}_\text {BASE}\)  [4] has 12 layers of transformer blocks, and each block has 12 self-attention heads. When fine-tuning for ABSA, it concatenates the aspect and the sentence to form a segment pair, then uses the representation of the [CLS] token after the multi-layer transformers for sentiment classification.

  • RoBERTa: RoBERTa  [15] is a robustly optimized BERT approach. It replaces the static masking in BERT with dynamic masking, removes the next sentence prediction, and pre-trains with larger batches and corpora.

Table 2. Main experiment results on ATSA and ACSA (%). The results marked with \(^*\) are from the official evaluation, which does not provide accuracy.

3.4 Main Results and Analysis

Table 2 gives the results of different methods on two subtasks of ABSA.

The first part shows the performance of the non-transformer-based baselines. We observe that the vanilla LSTM performs very poorly on this new MAMS dataset, because it does not consider any aspect information and is a sentence-level sentiment classification model. In fact, LSTM can obtain fairly good results on previous ABSA datasets, which reveals the challenge of the MAMS dataset. Compared with the other attention-based models, RAM and MGAN achieve better performance on ATSA, which validates the effectiveness of multi-hop attention and multi-grained attention for detecting aspect sentiment. It is surprising that TD-LSTM obtains competitive results among the non-transformer-based baselines. This result indicates that modeling the position information of the aspect term may be crucial for the MAMS dataset.

The second part gives two strong baselines, i.e., \(\text {BERT}_\text {BASE}\) and RoBERTa. They follow the conventional ABSA scheme and deal with one aspect at a time. We observe that they outperform the non-transformer-based models significantly, which shows the power of pre-trained language models. Benefiting from larger pre-training corpora, larger batch sizes, and more parameters, RoBERTa obtains better performance than \(\text {BERT}_\text {BASE}\) on both ATSA and ACSA.

Compared with the strongest baseline RoBERTa, our proposed Transformer-based Multi-aspect Modeling method RoBERTa-TMM still achieves clear improvements on the challenging MAMS dataset. Specifically, it outperforms RoBERTa by 1.93% and 1.91% respectively in accuracy and F1-score for ATSA. In terms of ACSA, the improvement of RoBERTa-TMM over RoBERTa is relatively limited. This may be because the pre-defined aspect categories are abstract, and it is challenging to find their corresponding sentiment spans in the sentence even under the multi-aspect scheme. Nevertheless, the improvement on ACSA is still meaningful, since the MAMS dataset is sufficiently large for ABSA research. Finally, our RoBERTa-TMM-based ensemble system achieves F1-scores of 85.24% and 79.41% respectively for ATSA and ACSA, and ranks second in the NLPCC 2020 Shared Task 2 evaluation.

3.5 Case Study

Fig. 4. Attention visualization of RoBERTa-TMM and RoBERTa on ATSA. The words in red are the two given aspect terms. Darker blue denotes a larger attention weight. (Color figure online)

To further validate the effectiveness of the proposed TMM scheme, we take a sentence from ATSA as an example, average the attention weights over the different heads of the RoBERTa-TMM and RoBERTa models, and visualize them in Fig. 4.

From the attention visualization, we can see that in the RoBERTa-TMM model the two aspect terms attend to their corresponding sentiment spans correctly through multi-aspect modeling. In contrast, given the aspect term “Food”, RoBERTa mistakenly focuses on the sentiment span of the other aspect term “fish” because it lacks information about the other aspects, and thus makes a wrong sentiment prediction. The attention visualization indicates that RoBERTa-TMM can detect the corresponding sentiment spans of different aspects and avoid incorrect attention as much as possible by modeling multiple aspects simultaneously and considering the potential relations between them.

4 Related Work

4.1 Aspect-Based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) has been studied over the last decade. Early works devoted themselves to designing effective hand-crafted features, such as n-gram features  [8, 11] and sentiment lexicons  [24]. Motivated by the success of deep learning in many tasks  [1, 3, 12], recent works adopt neural network-based methods to automatically learn low-dimensional continuous features for ABSA.  [21] separates the sentence into the left context and right context according to the aspect term, then employs two LSTM networks to respectively encode them from the two ends of the sentence towards the aspect term. To capture aspect-specific context,  [25] proposes an aspect attention mechanism to aggregate important sentiment information from the sentence toward the given aspect. Following this idea,  [16] introduces interactive attention networks (IAN) to learn attentions over the context and aspect term interactively, and generates representations for the aspect and context words separately. Besides, some works employ memory networks to detect more powerful sentiment information with multi-hop attention and achieve promising results  [2, 22]. Instead of recurrent networks,  [26] uses aspect information as a gating mechanism on top of a convolutional neural network, dynamically selecting aspect-specific information for aspect sentiment detection. Subsequently, a BERT-based method achieves state-of-the-art performance on the ABSA task  [9].

However, the above methods perform ABSA with the conventional scheme, which separates the multiple aspects in a sentence and analyzes one aspect at a time. They only consider local sentiment information for the given aspect and may mistakenly focus on sentiment information belonging to other aspects. In contrast, our proposed Transformer-based Multi-aspect Modeling scheme (TMM) aims to learn the sentiment information of and relations between multiple aspects for better prediction.

4.2 Pre-trained Language Model

Recently, substantial work has shown that pre-trained language models can learn universal language representations that benefit downstream NLP tasks and avoid training a new model from scratch  [4, 13, 15, 27]. These pre-trained models, e.g., GPT, BERT, XLNet, and RoBERTa, use the strategy of first pre-training and then fine-tuning, and achieve great success on many NLP tasks. Specifically, they first pre-train with self-supervised objectives, such as masked language modeling (MLM), next sentence prediction (NSP), or sentence order prediction (SOP)  [13], on large corpora to learn complex semantic and syntactic patterns from raw text. When fine-tuning on downstream tasks, they generally employ one additional output layer to learn task-specific knowledge.

Following this successful learning paradigm, we employ RoBERTa as the backbone network in this work, then fine-tune it with the TMM scheme on the MAMS dataset to perform ATSA and ACSA.

5 Conclusion

Facing the challenging MAMS dataset, we re-formalize ABSA as a task of multi-aspect sentiment analysis and propose a novel Transformer-based Multi-aspect Modeling scheme (TMM) for MAMS, which determines the sentiments of all aspects in a sentence simultaneously. Specifically, TMM transforms the original sentence into a new multi-aspect sequence, then applies multi-layer transformers to automatically learn the sentiment clues and potential relations of the multiple aspects in the sentence. Compared with previous works that analyze one aspect at a time, our TMM scheme not only improves computation efficiency but also achieves substantial improvements on the MAMS dataset. Finally, our method achieves second place in the NLPCC 2020 Shared Task 2 evaluation. Experimental results and analysis also validate the effectiveness of the proposed method.