
1 Introduction

Empathy is the ability to understand others' feelings and respond appropriately to their situations. Previous studies have shown that empathetic dialogue models can improve users' satisfaction in several areas, such as customer service [14] and healthcare communities [26]. Therefore, how to successfully implement empathy has become one of the key issues in building an intelligent and considerate agent. In recent years, many studies have been conducted on the task of empathetic dialogue generation, and they fall mainly into two categories. The first enhances the understanding of a user's situation and emotion by leveraging knowledge from one or more external knowledge bases [11, 15, 21, 25] or by adding emotion causes as prior emotion knowledge [3, 25], thereby improving the cognitive ability. The issue with this line of work is that it overlooks the importance of paths between users' critical keywords, which actually reflect the contextual logic of the conversation. Although some studies [25] build paths between emotion concepts and cause concepts, they focus mainly on causality and ignore the fact that paths between any keywords can help. The second category designs emotion strategies, such as mixtures of experts [12], emotion mimicry [17] and multi-resolution emotions [10], to generate appropriate responses from the affective aspect. Unfortunately, these studies learn to respond properly mainly according to the speaker's emotion rather than both interlocutors' emotions. In this paper, we aim to improve upon these weak aspects of existing work to help advance the study of empathetic dialogue generation.

Fig. 1.

A dialogue from the EmpatheticDialogues dataset. The cognitive ability is improved by retrieving entities (bold in black) and relationships (grey) from ConceptNet and building paths between critical keywords (red) to generate a high-quality response under the influence of the anxious and confident emotions and the wishing dialogue act. (Color figure online)

Psychological research shows that empathy is a complex mental process involving three aspects of the interlocutors: cognition, affection and behavior [13]. Specifically, cognitive empathy refers to the ability to understand and interpret a user's situation [2]; affective empathy is an emotional reaction based on differentiating the emotions of oneself and others [13]; behavioral empathy means the verbal or non-verbal forms of communication used in empathetic dialogue [6]. Among the existing works, some only consider the aspects of cognition and affection [21, 28]; others mainly consider the aspect of behavior [1, 27]. None of the existing works has comprehensively considered all three aspects (cognition, affection, behavior), which we believe are all important. In the following, we elaborate with the example in Fig. 1. The dialogue in Fig. 1 shows that (1) Cognition: The speaker is anxious about attending a job interview. In the first turn, there exists a path between <job, interview> with internship as a bridge, which enhances the understanding of the keywords and the context. In the next turn, the paths between <poorly, asked> and <asked, job> are built to alleviate the difficulty of capturing the contextual logic from limited context. Thus, the paths, which establish the relationships between utterances, are critical for improving the cognitive ability. (2) Affection: In interpersonal conversations, responses are usually influenced by both interlocutors' emotions [5]. As shown in Fig. 1, in the second turn, instead of both sides falling into anxiety, the listener is able to perceive the speaker's emotion and accept the emotion difference between them, thus generating a response with a more positive emotion (hopeful). Therefore, learning the emotional dependencies between the context and the target response based on both interlocutors' emotions is critical for responding properly. (3) Behavior: Appropriate dialogue acts are used as communicative forms to enhance empathy expression. For example, the listener inspires the speaker by encouraging and relaxes the speaker by wishing. Different from [27], we consider that all responses (rather than only some of them) are generated under the guidance of dialogue acts. In this way, we can guide dialogue generation better.

To this end, we propose a novel empathetic dialogue generation model covering the aspects of Cognition, Affection and Behavior (CAB) to achieve a comprehensive empathetic dialogue task. Specifically, since keywords are important for understanding the contextual logic, our model builds paths between critical keywords through multi-hop commonsense reasoning to enhance the cognitive ability. A Conditional Variational Auto-Encoder (CVAE) model with dual latent variables is built based on both interlocutors' emotions, and then the dual latent variables are injected into the decoder together with the dialogue act features to produce empathetic responses from the perspectives of affection and behavior. Our contributions are summarized as follows:

  • To the best of our knowledge, we are the first to propose a novel framework for empathetic dialogue generation based on psychological theory from three perspectives: cognition, affection and behavior.

  • We propose a context-based multi-hop reasoning method, in which paths are established between critical keywords to acquire implicit knowledge and learn contextual logic.

  • We present a novel CVAE model, which introduces dual latent variables to learn the emotional dependencies between the context and target responses. After that, we incorporate the dialogue act features into the decoder to guide the generation.

  • Experiments demonstrate that CAB generates more relevant and empathetic responses compared with the state-of-the-art methods.

2 Related Work

Recently, there have been numerous works on the task of empathetic dialogue generation, which was proposed by Rashkin et al. [20]. Lin et al. [12] assign different decoders to various emotions and fuse the output of each decoder with the user's emotion weights. Majumder et al. [17] adopt emotion stochastic sampling and emotion mimicry to respond to positive or negative emotions when generating empathetic responses. Li et al. [10] construct an interactive adversarial learning network that considers multi-resolution emotions and user feedback. Liu et al. [16] incorporate anticipated emotions into response generation via reinforcement learning. Gao et al. [3] adopt emotion causes to better understand the user's emotion. However, all of the above methods only consider the user's emotion and ignore the mutual influence between both interlocutors' emotions in the dialogue.

Several studies have incorporated external knowledge into empathetic dialogue generation. Li et al. [11] employ multi-type knowledge to explore implicit information and construct an emotional context graph to improve emotional perception. Liu et al. [15] prepend the retrieved knowledge triples to the gold responses in order to obtain proper responses. However, these approaches retrieve knowledge triples without fully considering the contextual meaning of the words. Although Wang et al. [25] adopt ConceptNet to explore emotional causality through commonsense reasoning between the emotion clause and the cause clause, the logical relationships between other utterances may be ignored. Sabour et al. [21] use ATOMIC for commonsense reasoning to better understand the user's situation and feelings, but reasoning over the whole dialogue history may neglect the important role of keywords in the context. To overcome the aforementioned shortcomings, we propose a context-based multi-hop commonsense reasoning method to enrich contextual information and reason about the logical relationships between utterances.

Fig. 2.

The overall architecture of CAB.

3 Method

3.1 Task Formulation and Overview

In empathetic dialogue generation, each dialogue consists of a dialogue history \(C=[S_1,L_1,S_2,L_2,\ldots ,S_{N-1},L_{N-1},S_{N}]\) of \(2N-1\) utterances and a gold empathetic response \(L_N=[w_N^1,w_N^2,\ldots ,w_N^n]\) of \(n\) words, where \(S_i\) and \(L_i\) denote the i-th utterance of the speaker and the listener, respectively. Our goal is to generate an empathetic response \(R=[r_1,r_2,\ldots ,r_m]\) based on the dialogue history C, the speaker's emotion \(e_s\), the listener's emotion \(e_l\), and the listener's dialogue act \(a_l\).

We provide an overview of CAB in Fig. 2, which consists of five components: (a) Emotional Context Representation. The predicted emotions \(e_s\) and \(e_l\) are fed, together with the context C, into the emotional context encoder to obtain the emotional context representations \(\boldsymbol{\hat{H}}_S\) and \(\boldsymbol{\hat{H}}_L\); (b) Affection. The prior network and the posterior network then capture the dual latent variables \(\boldsymbol{z}_s\) and \(\boldsymbol{z}_l\) based on \(\boldsymbol{\hat{H}}_S\) and \(\boldsymbol{\hat{H}}_L\) in the testing and training phases, respectively; (c) Cognition. To build the paths P, we leverage ConceptNet to acquire external knowledge and incorporate it into C to obtain a knowledge-enhanced context representation \(\boldsymbol{\hat{H}}_C\); (d) Behavior. The dialogue act features \(\boldsymbol{E}_a\) are distilled based on a predictor and the embedding layer; (e) Response Generation. The three-stage decoder generates an empathetic response R based on the aspects of affection, cognition and behavior.

We evaluate the model on EmpatheticDialogues [20], a publicly available benchmark dataset for empathetic dialogue generation. However, the dialogues in this dataset do not contain emotion and dialogue act labels for each listener's utterance, so we annotate them with EmoBERTa [7] and EmoBERT [27], respectively, to support the studies in this paper.

From Sect. 3.2 to Sect. 3.7, we introduce CAB only briefly due to space limits. More model and experiment details can be found in the full version [4].

3.2 Emotional Context Encoder

Input Representation. We divide the dialogue history into two segments, \(C_S=[S_1,S_2,\ldots ,S_N]\) and \(C_L=[L_1,L_2,\ldots ,L_{N-1}]\). Following previous work [12], we first obtain the embeddings of the speaker context, listener context, global context and gold response, respectively. The embeddings of the speaker context and listener context are then fed into the Transformer-based inter-encoder (ItrEnc) to obtain \(\boldsymbol{H}_S\) and \(\boldsymbol{H}_L\), and the Transformer encoder (TransEnc) encodes the embeddings of the global context and gold response into \(\boldsymbol{H}_C\) and \(\boldsymbol{H}_N\).
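The split of the dialogue history into the two segments is mechanical; the following sketch (illustrative only, assuming C alternates speaker and listener utterances as in Sect. 3.1) makes it explicit:

```python
# Illustrative sketch: split the alternating dialogue history
# C = [S_1, L_1, S_2, L_2, ..., S_N] into C_S and C_L.
def split_history(C):
    C_S = C[0::2]   # speaker utterances S_1, ..., S_N
    C_L = C[1::2]   # listener utterances L_1, ..., L_{N-1}
    return C_S, C_L

# split_history(["S1", "L1", "S2", "L2", "S3"]) -> (["S1", "S2", "S3"], ["L1", "L2"])
```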

Emotion Classification. To understand the emotions of the speaker and the listener, we project the hidden representations of the first tokens of \(\boldsymbol{H}_S\) and \(\boldsymbol{H}_L\) into the emotion category distributions \(P_{s}\) and \(P_{l}\) to predict their emotions. We then send the emotions to a trainable emotion embedding layer to obtain the emotion state embeddings \(\boldsymbol{E}_{emos}\) and \(\boldsymbol{E}_{emol}\).

Emotion Self-attention. To make the latent variables in Sect. 3.3 incorporate both interlocutors' emotions, \(\boldsymbol{H}_S\) and \(\boldsymbol{H}_L\) are concatenated with \(\boldsymbol{E}_{emos}\) and \(\boldsymbol{E}_{emol}\), respectively, and then fed into a self-attention layer followed by a linear layer to obtain the emotional context representations \(\boldsymbol{\hat{H}}_S\) and \(\boldsymbol{\hat{H}}_L\).
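To make this concrete, the following PyTorch sketch shows one way the emotion embedding, self-attention and linear projection could be combined; the layer sizes, the prepending of the emotion state along the sequence dimension, and the 32-way emotion space are our assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class EmotionalContextLayer(nn.Module):
    """Sketch of the emotional context encoder head (Sect. 3.2)."""
    def __init__(self, d_model=300, n_emotions=32, n_heads=6):
        super().__init__()
        self.emo_emb = nn.Embedding(n_emotions, d_model)    # trainable emotion embedding
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.proj = nn.Linear(d_model, d_model)              # linear layer after self-attention

    def forward(self, H, emo_ids):
        # H: [B, T, d] encoded (speaker or listener) context; emo_ids: [B] predicted emotions
        E_emo = self.emo_emb(emo_ids).unsqueeze(1)            # [B, 1, d] emotion state
        x = torch.cat([E_emo, H], dim=1)                      # concatenate emotion with context
        out, _ = self.self_attn(x, x, x)
        return self.proj(out)                                 # emotional context representation

layer = EmotionalContextLayer()
H_hat_s = layer(torch.randn(2, 20, 300), torch.tensor([3, 7]))   # -> [2, 21, 300]
```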

3.3 Prior Network and Recognition Network (Affection)

We introduce dual latent variables \(\boldsymbol{z}_*\in \{\boldsymbol{z}_s,\boldsymbol{z}_l\}\) in the CVAE, mapping the input sequences \(C_*\in \{C_S,C_L\}\) into the output sequence \(L_N\) via \(\boldsymbol{z}_*\). Taking the speaker as an example, we illustrate how the prior network and the recognition network are realized. The prior network \(p_\theta (\boldsymbol{z}_s \vert C_S)\) is parameterized by 3-layer MLPs to compute the mean \(\mu _s^\prime \) and variance \(\sigma _s^{\prime 2}\) of \(\boldsymbol{z}_s\). The network structure of the recognition network \(q_\varphi (\boldsymbol{z}_s \vert C_S,L_N)\) is the same as that of the prior network, except that its input also includes \(\boldsymbol{H}_N\). In order to learn the emotional dependencies based on both interlocutors' emotions, we fuse \(\boldsymbol{z}_s\) and \(\boldsymbol{z}_l\) according to the emotional similarity coefficient \(\beta \) between \(\boldsymbol{E}_{emos}\) and \(\boldsymbol{E}_{emol}\), obtaining \(\boldsymbol{z}=\beta \cdot \boldsymbol{z}_s+(1-\beta )\cdot \boldsymbol{z}_l\).
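A minimal sketch of the speaker-side latent variable and the fusion step is given below; the pooling of the context into a single vector, the hidden sizes and the use of cosine similarity for \(\beta\) are our assumptions, since the paper only specifies 3-layer MLPs and the weighted sum.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(d_in, d_hid, d_out):
    # 3-layer MLP producing the concatenation [mu; log_var]
    return nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU(),
                         nn.Linear(d_hid, d_hid), nn.ReLU(),
                         nn.Linear(d_hid, d_out))

class SpeakerLatent(nn.Module):
    def __init__(self, d_ctx=300, d_z=200):
        super().__init__()
        self.prior = mlp(d_ctx, 300, 2 * d_z)            # p_theta(z_s | C_S)
        self.recog = mlp(2 * d_ctx, 300, 2 * d_z)        # q_phi(z_s | C_S, L_N)

    def forward(self, h_ctx, h_resp=None):
        # h_ctx: pooled emotional context [B, d]; h_resp: pooled gold response (training only)
        if h_resp is None:                               # test phase: sample from the prior
            mu, log_var = self.prior(h_ctx).chunk(2, dim=-1)
        else:                                            # training phase: sample from the posterior
            mu, log_var = self.recog(torch.cat([h_ctx, h_resp], -1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)   # reparameterization trick
        return z, mu, log_var

def fuse_latents(z_s, z_l, E_emos, E_emol):
    # beta as an emotional similarity coefficient (cosine similarity is an assumption)
    beta = F.cosine_similarity(E_emos, E_emol, dim=-1).clamp(0, 1).unsqueeze(-1)
    return beta * z_s + (1 - beta) * z_l                 # z = beta*z_s + (1-beta)*z_l
```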

3.4 Knowledge Acquisition and Fusion (Cognition)

Knowledge Acquisition. We first obtain the keyword set \(\tau _{all}\) of size \(\boldsymbol{cw}\) from \(C_S\) based on the TextRank algorithm [18]. Then we build paths as follows:

a. Take one keyword in \(\tau _{all}\) as the head entity \(h_i\in \tau _{all}\), then feed the embeddings of \(h_i\) and the speaker context into ItrEnc to extract the semantic features of \(h_i\). The Top-K knowledge triples in ConceptNet associated with \(h_i\) are retrieved based on a score and a removed relation set [11].

b. To ensure that the triples are logically related to the other keywords \(\tau _{other}\), we first obtain the semantic features of \(h_j\in \tau _{other}\) as in step a. After ranking the triples by the relevance between the tail entity and \(h_j\), we select the Top-k triples. If the tail entity is the same as \(h_j\), which indicates that there exists a one-hop path between \(h_i\) and \(h_j\), we add them to the final keyword set \(\tau _r\) (e.g., the red circles in Fig. 2). If not, the tail entity is added to \(\tau _{all}\) to continue searching for paths by repeating steps a and b. Finally, we retain some paths P (e.g., the paths connected by grey arrows in Fig. 2) for further fusion; a simplified sketch of this search is given below. The attention weight vector \(\boldsymbol{g}\) is calculated by the attention mechanism to measure the importance of each word in C with respect to \(\tau _r\).
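The following sketch illustrates the path search between keywords; retrieve_topk_triples is a hypothetical helper standing in for the ConceptNet lookup and scoring of step a, and the hop limit and data structures are our simplifications, not the authors' implementation.

```python
def build_paths(keywords, retrieve_topk_triples, max_hops=2, k=10):
    """Search for (multi-hop) paths that connect pairs of context keywords."""
    paths, tau_r = [], set()
    frontier = [(h, [h]) for h in keywords]                    # (current head entity, path so far)
    for _ in range(max_hops):
        next_frontier = []
        for head, path in frontier:
            for rel, tail in retrieve_topk_triples(head, k):   # Top-k scored triples for the head
                other_keywords = [w for w in keywords if w != path[0]]
                if tail in other_keywords:                     # tail matches another keyword: path closed
                    paths.append(path + [rel, tail])
                    tau_r.update([path[0], tail])              # keep both endpoints in the keyword set
                else:                                          # otherwise expand from the new tail entity
                    next_frontier.append((tail, path + [rel, tail]))
        frontier = next_frontier
    return paths, tau_r
```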

Knowledge Fusion. We first convert the paths into sequences. The sequences are then fed into a two-layer Bi-GRU to obtain the knowledge representation \(\boldsymbol{H}_k\). Finally, following previous work [21], we concatenate \(\boldsymbol{H}_k\) with the context at the token level to learn the knowledge-enhanced context representation \(\boldsymbol{\hat{H}}_C\).
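A rough PyTorch sketch of this fusion step is shown below; pooling the path states before broadcasting them over the context tokens is our simplification of the token-level concatenation.

```python
import torch
import torch.nn as nn

class KnowledgeFusion(nn.Module):
    def __init__(self, d_model=300):
        super().__init__()
        self.bigru = nn.GRU(d_model, d_model // 2, num_layers=2,
                            bidirectional=True, batch_first=True)   # two-layer Bi-GRU
        self.out = nn.Linear(2 * d_model, d_model)                   # fuse context and knowledge

    def forward(self, path_emb, H_c):
        # path_emb: [B, Lp, d] embedded path sequences; H_c: [B, Tc, d] context states
        H_k, _ = self.bigru(path_emb)                                # knowledge representation H_k
        k_tok = H_k.mean(dim=1, keepdim=True).expand(-1, H_c.size(1), -1)  # broadcast to tokens
        return self.out(torch.cat([H_c, k_tok], dim=-1))             # knowledge-enhanced context
```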

3.5 Dialogue Act Predictor and Representation (Behavior)

To guide the communicative form of empathetic dialogue generation, our model uses the first token of \(\boldsymbol{\hat{H}}_C\) to predict dialogue act \(\boldsymbol{a}_l\). Then, \(\boldsymbol{a}_l\) is fed into the embedding layer to learn the dialogue act embedding representation \(\boldsymbol{E}_a\).
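A small sketch of this predictor is given below; the number of dialogue acts and the argmax decoding are placeholders for illustration.

```python
import torch
import torch.nn as nn

class DialogueActHead(nn.Module):
    def __init__(self, d_model=300, n_acts=9):           # n_acts is a placeholder value
        super().__init__()
        self.cls = nn.Linear(d_model, n_acts)            # act classifier over the first token
        self.act_emb = nn.Embedding(n_acts, d_model)     # trainable dialogue act embedding

    def forward(self, H_hat_c):
        logits = self.cls(H_hat_c[:, 0])                  # first token of the enhanced context
        a_l = logits.argmax(dim=-1)                       # predicted dialogue act
        return logits, self.act_emb(a_l)                  # logits for the act loss, E_a for the decoder
```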

3.6 Response Generation

Finally, the aforementioned information \(\boldsymbol{E}_a\), \(\boldsymbol{g}\), \(\boldsymbol{z}\) and \(\boldsymbol{\hat{H}}_{C}\) is applied in the Transformer-based decoder (TransDec) through the following three stages: (1) The embedding of the start-of-sequence token \(\boldsymbol{E}_{SOS}\) and \(\boldsymbol{E}_a\) are fed into a linear layer, and the resulting high-level act features are used to guide the generation. (2) We design a multi-head keywords attention, which takes the output of the cross-attention layer as the query and the dot-product of \(\boldsymbol{g}\) and \(\boldsymbol{\hat{H}}_C\) as the key and value. TransDec then outputs the hidden state \(\boldsymbol{H}_G\). (3) To learn the emotional dependencies, we concatenate \(\boldsymbol{z}\) and \(\boldsymbol{H}_G\) at the token level and use a pointer network [23] to output the probability distribution over the vocabulary.
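The sketch below illustrates stage (2); reading the dot-product of \(\boldsymbol{g}\) and \(\boldsymbol{\hat{H}}_C\) as a per-token re-weighting of the context is our interpretation rather than a confirmed implementation detail.

```python
import torch
import torch.nn as nn

class KeywordsAttention(nn.Module):
    def __init__(self, d_model=300, n_heads=6):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, dec_states, H_hat_c, g):
        # dec_states: [B, Td, d] cross-attention output (query)
        # H_hat_c:    [B, Tc, d] knowledge-enhanced context
        # g:          [B, Tc]    keyword importance weights
        kv = H_hat_c * g.unsqueeze(-1)            # emphasize keyword-relevant tokens
        out, _ = self.attn(dec_states, kv, kv)    # keys and values share the re-weighted context
        return out                                # hidden state H_G passed to stage (3)
```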

3.7 Training Objectives

We jointly optimize the emotion classification losses, the dialogue act prediction loss, the loss of the CVAE model and the bag-of-words loss:

$$\begin{aligned} \mathcal {L}=\gamma _1 \mathcal {L}_{s}+\gamma _2 \mathcal {L}_{l}+\gamma _3 \mathcal {L}_{a}+ \gamma _4 \mathcal {L}(C_*,L_N;\theta ,\varphi )+\gamma _5 \mathcal {L}_{bow} \end{aligned}$$
(1)

where \(\gamma _1\), \(\gamma _2\), \(\gamma _3\), \(\gamma _4\) and \(\gamma _5\) are hyper-parameters.
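Assembled in code, the objective might look like the following sketch; the \(\gamma\) values are placeholders and kl_weight denotes the KL-annealing factor mentioned in Sect. 4.1.

```python
def total_loss(L_s, L_l, L_a, recon, kl_s, kl_l, L_bow,
               gammas=(1.0, 1.0, 1.0, 1.0, 1.0), kl_weight=1.0):
    # L_s, L_l: emotion classification losses; L_a: dialogue act prediction loss;
    # recon plus the annealed KL terms form the CVAE loss; L_bow: bag-of-words loss.
    g1, g2, g3, g4, g5 = gammas
    L_cvae = recon + kl_weight * (kl_s + kl_l)
    return g1 * L_s + g2 * L_l + g3 * L_a + g4 * L_cvae + g5 * L_bow
```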

4 Experiments

4.1 Experimental Setup

Baselines. We compare our model with the following state-of-the-art models: (1) Transformer [22]: The vanilla Transformer with the pointer network, trained by optimizing the generation loss. (2) Multi-Trans [20]: A variant of the Transformer that adds an emotion classification loss to the generation loss to jointly optimize the model. (3) MOEL [12]: A model with several Transformer decoders whose outputs are softly combined to generate responses. (4) MIME [17]: A model adopting emotion mimicry and emotion clusters to deal with positive or negative emotions. (5) EmpDG [10]: A generative adversarial network that considers multi-resolution emotions and introduces discriminators to supervise training in semantics and emotion. (6) KEMP [11]: A model that uses two types of knowledge to help understand and express emotions. (7) CEM [21]: A method that generates empathetic responses by leveraging commonsense to improve the understanding of interlocutors' situations and feelings.

Implementation Details. We implement all models in PyTorch on a GeForce RTX 3090 GPU, and train them using the Adam optimizer [8] with a mini-batch size of 16. All common hyper-parameters are the same as in [12]. We adopt 300-dimensional pre-trained 840B GloVe vectors [19] to initialize the word embeddings, which are shared between the encoders and the decoder. The hidden size is 300 everywhere, and the size of each latent variable is 200. We use KL annealing over 15,000 batches to achieve the best performance. During testing, the batch size is 1 and the maximum number of greedy decoding steps is 50.
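For reference, a linear KL-annealing schedule over 15,000 batches could be written as below; the linear shape is an assumption, as the paper only specifies the step budget.

```python
def kl_anneal_weight(step, total_steps=15000):
    # Ramp the KL weight from 0 to 1 over the first `total_steps` training batches.
    return min(1.0, step / total_steps)
```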

Automatic Evaluation Metrics. We choose the widely used PPL [24], Distinct-1 and Distinct-2 [9] as our main automatic metrics. PPL estimates the overall generation quality of a model, while Distinct-1 and Distinct-2 measure the diversity of responses. Since the emotion accuracy of the speaker/listener (EmoSA/EmoLA) reflects the understanding of both interlocutors' emotions and the dialogue act accuracy (ActA) indicates whether proper dialogue acts are chosen to produce responses, we also report these metrics.
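As a reference, Distinct-n is commonly computed as the ratio of unique n-grams to the total number of n-grams over all generated responses [9]; a minimal implementation is sketched below.

```python
def distinct_n(responses, n):
    total, unique = 0, set()
    for resp in responses:
        tokens = resp.split()
        ngrams = list(zip(*[tokens[i:] for i in range(n)]))   # all n-grams of the response
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / max(total, 1)    # proportion of distinct n-grams

# distinct_n(["i am so sorry to hear that", "that sounds great"], 2)
```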

Table 1. Results of the automatic evaluation. w/o Cog/Aff/Beh denote the ablation experiments; the best results among all models are in bold.

4.2 Results and Analysis

Automatic Evaluation Results. The overall automatic evaluation results are shown in Table 1. Our model CAB significantly outperforms the baselines on all metrics. The lower PPL score implies that CAB generally generates higher-quality responses, reflecting the importance of considering empathy from multiple perspectives. The remarkable improvements in Distinct-1 and Distinct-2 suggest that introducing external knowledge helps improve the understanding of the dialogue history and thus produces a wider variety of responses. The higher accuracy of emotion classification verifies the validity of modelling both interlocutors' emotions separately.

Ablation Study. As shown in the bottom part of Table 1, we also conduct ablation experiments to explore the effect of each component. From the results, we observe that all metrics except PPL decrease, especially Distinct-1 and Distinct-2, when commonsense knowledge acquisition and fusion are removed (w/o Cog), suggesting that the paths capture additional information to enhance the cognitive ability and thus improve the quality and diversity of responses. The increased PPL score may be due to the introduction of knowledge, which can affect the fluency of the generated responses. In addition, we find that only considering the speaker's emotion by removing the latent variable of the listener (w/o Aff) yields lower emotion accuracy and a higher PPL score; it is thus difficult to generate appropriate responses without accurately understanding both interlocutors' emotions. All metrics decrease when we remove the dialogue act classification and the dialogue act features fused at the decoder (w/o Beh), indicating the importance of dialogue acts in improving empathy.

5 Conclusions

In this paper, we build paths by leveraging commonsense knowledge to enhance the understanding of the user's situation, consider both interlocutors' emotions, and guide response generation through dialogue acts, thereby generating empathetic responses from three perspectives: cognition, affection and behavior. Extensive experiments on benchmark metrics show that our method CAB outperforms the state-of-the-art methods, demonstrating its effectiveness in improving the empathy of the generated responses.