1 From Task-Oriented Agents to Social Bots

Devices like the Amazon Echo and Google Home have entered our homes to perform task-oriented functions, such as looking up today’s headlines and setting reminders [1, 5]. As these devices evolve, we have begun to expect social conversation from them, which requires the device to personalize its responses and produce natural-sounding language.

Social conversation is not explicitly goal-driven in the same way as task-oriented dialogue. Many dialogue systems in both the written and spoken medium have been developed for task-oriented agents with an explicit goal, such as restaurant information retrieval, booking a flight, diagnosing an IT issue, or providing automotive customer support [6, 8, 18, 23, 25, 26]. These tasks often revolve around question answering, with little “chit-chat”. Templates are often used for generation and state tracking, but because they are optimized for the task at hand, the conversation either becomes stale or can be sustained only through the intractable task of manually authoring many different social interactions for each context.

We argue that a social agent should be spontaneous and allow for human-friendly conversations that do not follow a perfectly defined trajectory. To build such a conversational dialogue system, we exploit the abundance of human-human social media conversations, and develop methods informed by natural language processing modules that model, analyze, and generate utterances that better suit the context.

2 Data-Driven Models of Human Language

The abundance of social media data has led to the development of new techniques for language understanding from open-domain conversations, and many corpora are available for building data-driven dialogue systems [19, 20]. While there are differences between how people speak in person and in an online text-based environment, the social agents we build should not be limited in their language; they should be exposed to many different styles and vocabularies. Online conversations can be repurposed in new dialogues, but only if they can be properly indexed or adapted to the context. Data retrieval algorithms have been successfully employed to co-construct an unfolding narrative between the user and computer [22], and to reuse existing conversations [7]. Other approaches train on such conversations to analyze sequence and word patterns, but lack detailed annotation and analysis of phenomena such as emotion and humor [10, 21, 24]. The Ubuntu Dialogue Corpus [12], with over 7 million utterances, is large enough to train neural network models [9, 11].

We argue that combining data-driven retrieval with modules for sentiment and style analysis, topic analysis, summarization, paraphrasing, rephrasing, and search will allow for more human-like social conversation [3]. This requires that data be indexed by domain and requirement, and that candidate utterances then be retrieved based on dialogue state and context. To avoid stale and repetitive utterances, we can alter and repurpose the candidate utterances; for example, we can use paraphrase or summarization to create new ways of saying the same thing, or select utterance candidates according to the desired sentiment [14, 15]. The style of an utterance can also be altered as required, for instance by introducing elements of sarcasm, or aspects of factual and emotional argumentation styles [16, 17]. Changes in the perceived speaker personality can make conversations more personable [13]. Even utterances from monologic texts can be leveraged by converting the content to dialogic flow and performing stylistic transformations [2].
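As a rough illustration of the index-then-retrieve idea described above, the following minimal sketch indexes utterances by topic, filters candidates by a desired sentiment, and ranks them by word overlap with the dialogue context. Everything in it is an assumption made for illustration: the names `Utterance`, `UtteranceIndex`, and `retrieve` are hypothetical, the tiny sentiment lexicon stands in for a trained sentiment module, and word overlap stands in for a real retrieval model.

```python
from collections import defaultdict
from dataclasses import dataclass

# Toy sentiment lexicon (an assumption; a real system would use a trained model).
POSITIVE = {"love", "great", "fun", "awesome"}
NEGATIVE = {"hate", "boring", "awful", "terrible"}


def tokens(text):
    """Lowercase a string and return its set of punctuation-stripped words."""
    return {w.strip(".,!?;:'\"").lower() for w in text.split()} - {""}


def sentiment(text):
    """Label text 'positive', 'negative', or 'neutral' by lexicon hits."""
    words = tokens(text)
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


@dataclass(frozen=True)
class Utterance:
    text: str
    topic: str


class UtteranceIndex:
    """Hypothetical index that groups utterances by topic."""

    def __init__(self):
        self._by_topic = defaultdict(list)

    def add(self, utterance):
        self._by_topic[utterance.topic].append(utterance)

    def retrieve(self, topic, context, desired_sentiment, k=3):
        """Return up to k utterances on the topic with the desired sentiment,
        ranked by how many words they share with the dialogue context."""
        context_words = tokens(context)
        candidates = [
            u for u in self._by_topic.get(topic, [])
            if sentiment(u.text) == desired_sentiment
        ]
        candidates.sort(
            key=lambda u: len(tokens(u.text) & context_words), reverse=True
        )
        return candidates[:k]


index = UtteranceIndex()
index.add(Utterance("I love that movie, the ending was great!", "movies"))
index.add(Utterance("That movie was boring and the ending was awful.", "movies"))
index.add(Utterance("The new phone is awesome.", "tech"))

results = index.retrieve("movies", "What did you think of the ending?", "positive")
for u in results:
    print(u.text)
```

In this sketch the sentiment filter is applied before ranking, so the agent only ever considers candidates that match the requested tone; a paraphrase or style-transfer step could then be applied to the selected candidate.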

Of course, while many data sources may be of interest for indexing knowledge for a dialogue system, annotations are not always available or easy to obtain. By using machine learning models trained to recognize classes of interest, such as sentiment, sarcasm, and topic, labels can be bootstrapped for unannotated data, greatly increasing the amount of data available for indexing and utterance selection [17].
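One simple way to picture this kind of bootstrapping is a single round of self-training: a classifier is fit on a small labeled seed set, applied to unlabeled utterances, and only its confident predictions are kept as new labels. The sketch below is an illustration under stated assumptions, not the method of [17]: the `NaiveBayes` class, the `bootstrap` function, the example utterances, and the 0.75 confidence threshold are all hypothetical choices.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a string and return its punctuation-stripped words as a list."""
    words = (w.strip(".,!?;:'\"").lower() for w in text.split())
    return [w for w in words if w]


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, examples):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        for text, label in examples:
            self.label_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict_proba(self, text):
        """Return a dict mapping each label to its posterior probability."""
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[w] + 1) / denom)
            log_scores[label] = score
        # Normalize in a numerically stable way.
        peak = max(log_scores.values())
        exp = {label: math.exp(s - peak) for label, s in log_scores.items()}
        z = sum(exp.values())
        return {label: v / z for label, v in exp.items()}


def bootstrap(seed, unlabeled, threshold=0.75):
    """Label unlabeled texts with a seed-trained model, keeping only
    predictions whose probability reaches the threshold."""
    model = NaiveBayes().fit(seed)
    added = []
    for text in unlabeled:
        probs = model.predict_proba(text)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            added.append((text, label))
    return seed + added


seed = [
    ("I love this great show", "positive"),
    ("what a fun awesome day", "positive"),
    ("I hate this boring show", "negative"),
    ("what an awful terrible day", "negative"),
]
unlabeled = ["this show is great fun", "the train leaves at noon"]

for text, label in bootstrap(seed, unlabeled):
    print(label, "|", text)
```

The threshold trades coverage for label quality: utterances the model cannot classify confidently, such as the second unlabeled example here, are left out rather than added with a noisy label.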

There is no shortage of human-generated dialogues, but the challenge is to analyze and harness them appropriately for social-dialogue generation. We aim to combine data-driven methods for repurposing existing social media dialogues with a suite of tools for sentiment analysis, topic identification, summarization, paraphrase, and rephrasing, in order to develop a socially adept agent that can carry on a natural conversation [4].