
1 Introduction

Chatbots have long been an important topic in artificial intelligence [1]: developed language is one of the greatest achievements of humankind and one of the most basic means of our communication. It is no wonder that creating an artificial intelligence capable of communicating intelligently with us in natural language would be a scientific milestone. As the great philosopher Ludwig Wittgenstein once stated, “The limits of my language mean the limits of my world” [2]. AI is still far from having the language comprehension that we possess, but efforts are constantly being made.

The Loebner Prize [3] is an annual chatbot competition. Chatbots are subjected to a specific version of the Turing test, in which human judges determine whether their conversation partner is a human or an AI. The Loebner Prize has been a driving force in the development of NLP, with specific tools like AIML [4] and ChatScript [5, 6] evolving along the way. The chatbots submitted for the Loebner Prize are designed to imitate human conversation capabilities, and the test itself varies in terms of requirements and topics. Nevertheless, the graphical design of these bots is irrelevant, as only the “brain” itself is judged [7].

On the other hand, the market has seen an increasing need for chatbots in sectors like AI-aided customer support, management, information exchange, trade, healthcare [8], news, media and other services [9]. Conversational interfaces are seen as faster, easier to learn and more user-friendly than conventional mechanisms. The application of chatbots therefore seems to be a separate problem from the main branch of chatbot development, as the usages and requirements differ heavily from the original Loebner Prize approach. Here, chatbots can be seen rather as a blend of the classic Loebner Prize chatbot with various question-answering functionalities contained within a user-friendly interface. The interface itself not only serves as a view for the application, but also defines the communication model between the human user and the AI.

Apart from the business aspect, chatbots are currently being embedded as a key element of many modern computer programs, including games. An extremely interesting example of a highly developed chatbot interface can be seen in the game Event [0] by Ocelot Society, where the player plays as an astronaut on an abandoned space station and communicates with a sentient AI through a network of terminals [10, 11]. The gameplay can be advanced only by conversing with the AI, as it controls all the locks and mechanisms on the station. The game has received a fair amount of attention and critical praise, winning numerous game industry prizes (Fig. 1).

Fig. 1. Conversation with the AI in Event [0] (Ocelot Society, 2016)

2 Requirements

When defining the requirements for the graphical interface of a chatbot-based AI system, we need to consider the following questions [13]:

  • what will the chatbot be used for?

  • how will the chatbot be provided to the user? Via SMS-like messaging? Mobile devices? A web interface? An internal desktop application?

  • what conveniences will be provided to the user by the chatbot’s interface?

  • what scope of understanding and competence should the chatbot have?

  • how relevant are the chatbot’s conversation capabilities to its purpose?

A survey regarding the uses of chatbots in business was carried out by Usurv in May 2016 [12] on a nationally representative sample of 1,000 UK adults. The survey makes it possible to clearly identify the main needs of the market, as seen in the predicted usages of chatbots shown in Table 1.

Table 1. Predicted usages of chatbots [12]

The other replies provided include paying bills, expensive purchases and purchase inspiration. The emerging conclusion is further supported by the analysis of the benefits expected from utilizing chatbots as an intelligent business aid, as shown in Table 2.

Table 2. Most expected benefits from the usage of chatbots [12]

The respondents also raised concerns about the comprehension capabilities of chatbots: “Poor question understanding” was chosen as the number one barrier to chatbot usage, indicated by 55% of the respondents.

The interesting conclusion is therefore that the most common requirement is speed and aid in simple, repetitive tasks; advanced problem-solving capabilities, although also desired, have a much lower priority. It can thus be stated that advanced NLP mechanisms will not always be the solution to the problem, as inside a narrow domain [14] the chatbot will mostly operate on keywords and rely upon correct identification of the user’s intent and the entities regarding that intent. One of the most important factors to consider, though, will be the interface of the chatbot and its capability to provide quick, clean interaction methods and to clearly steer the conversation when needed.

3 Design Principles

The design principles for engaging chatbots rely mostly on interface design and scripting. The widely accepted standard is a design in the style of a text message (SMS) application or Facebook Messenger [15,16,17,18]. An example is shown in Fig. 2.

Fig. 2. Example of a mobile chat application.

The two main modules of such applications are the “contacts view” and the “chat view”. The former stores and displays the recent conversations that the user has had, while the latter displays the current conversation. The layout of the modules follows the modern principles of creating very simple, clean interfaces with a minimal number of active controls on-screen. Another trend is to use a responsive layout: many modern tools and libraries (for instance, Bootstrap [19] or Vaadin [20]) provide support for this type of interface.

Most modern chat applications designed for humans allow audio clips and images to be embedded into messages, which opens up additional possibilities: a small UI control, like a date selector or a choice input field, may be embedded in the bot’s message as well. This further reduces the number of active elements on-screen, as the control receives the user’s attention only for the time of interaction and is quickly and elegantly scrolled away as the conversation continues. The controls can be specialized in order to make it more convenient to provide the additional input data needed to answer the query. Whenever answering the question requires a set of data to be retrieved, charts and diagrams may be generated and embedded into the reply as well, provided that they are handled correctly by the chatbot’s brain.

Multi-bot [21] systems are also an interesting way of handling multiple topics: each time a specialized question is asked by the user, a dispatcher bot “invites” one of its friends to the conversation. The new bot acts as an additional party to the conversation, asks additional questions if needed and provides the final answer to the user. The conversation then returns to the basic one-on-one model. This approach need not be implemented in this way in the brain of the chatbot, but such an enhancement of the user’s experience may be appropriate and desired within a particular domain of interest.
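The dispatcher idea can be illustrated with a minimal Python sketch (3.9+). All names here (`SpecialistBot`, `Dispatcher`, the keyword sets) are hypothetical and chosen only for illustration; the simple keyword overlap used for routing is an assumption, not a method described in [21].

```python
from dataclasses import dataclass, field


@dataclass
class SpecialistBot:
    """A hypothetical topic-specific bot that can be invited to a chat."""
    name: str
    keywords: set[str]

    def answer(self, message: str) -> str:
        return f"[{self.name}] Here is what I found about: {message}"


@dataclass
class Dispatcher:
    """Routes a message to the first specialist whose keywords match."""
    specialists: list[SpecialistBot]
    transcript: list[str] = field(default_factory=list)

    def handle(self, message: str) -> str:
        words = set(message.lower().split())
        for bot in self.specialists:
            if words & bot.keywords:
                # The specialist joins, answers, and leaves again, so the
                # conversation returns to the one-on-one model.
                self.transcript.append(f"{bot.name} joined the conversation")
                reply = bot.answer(message)
                self.transcript.append(f"{bot.name} left the conversation")
                return reply
        return "[dispatcher] Could you tell me more about what you need?"
```

In this sketch the "invitation" is purely a presentation-level event recorded in `transcript`; the interface could render it as a new participant entering the chat view, regardless of how the brain is organized internally.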

3.1 Personality

The final user experience consists not only of correct answers, but also of the style of the conversation [17, 18]. In modern culture, internet conversations make heavy use of emojis, commonly known visual running gags (“memes”), internet slang and so on. Users often type quick, short messages, and it is not uncommon to spread a single thought across a few messages. We may therefore design the bot’s personality in a way that makes it less robotic and more appealing to the user. Slackbot [22] is a great example of a bot that has gained wide acceptance in the user community, as it provides a near-perfect blend of functionality, convenience and personality (Fig. 3).

Fig. 3. Slackbot onboarding [22].

Some of the techniques used to provide personality to a chatbot are:

  • simulating human internet conversations according to the chat platform itself, for instance by using emojis to communicate reactions and thoughts,

  • delaying the replies by a very short time in order to simulate the typing process (possibly with a floating “three dots” message, similar to the ones that are displayed when having a regular conversation in popular applications),

  • using the context of the conversation to add basic human traits, like remembering a problem that was solved before or introducing changes to the conversation style according to the conversation style of the user.
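The second technique, a simulated typing delay, can be sketched in Python as follows. The typing speed and the lower and upper bounds are assumed values picked for illustration, and `send_with_typing` is a hypothetical helper in which `send` stands in for whatever the chat platform uses to deliver a message.

```python
import time


def typing_delay(reply: str, chars_per_second: float = 40.0,
                 min_delay: float = 0.4, max_delay: float = 2.5) -> float:
    """Return a delay in seconds proportional to reply length, clamped."""
    raw = len(reply) / chars_per_second
    return max(min_delay, min(raw, max_delay))


def send_with_typing(reply: str, send=print, sleep=time.sleep) -> None:
    """Show a typing indicator, wait briefly, then deliver the reply."""
    send("...")  # stand-in for the floating "three dots" indicator
    sleep(typing_delay(reply))
    send(reply)
```

Clamping the delay keeps long answers from making the bot feel slow, which would conflict with the speed requirement identified in Section 2.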

Obviously, some applications of chatbots will require a much stronger personality (like the volatile AI in the aforementioned Event [0] game, where an emotion matrix is used to activate and deactivate certain dialogue options during gameplay), but the user’s experience of even a basic chatbot used for scheduling meetings can be easily enhanced with simple techniques like clever scripting and clear interface design.

4 Proposition of Reply Scenario Interface Model

As stated before, in the classic approach, the chatbot’s brain defines scripts for its conversation capabilities. The scripts vary according to the technology used and its complexity: for instance, environments like ChatScript use ontologies and allow the scripts to be defined and grouped into topics, concepts and highly advanced sequences. This is further used to better interpret the user’s input and provide a very human-like answer.

Specific domain chatbots [14] with business applications, on the other hand, rely on the following general algorithm:

  1. read user input,

  2. identify user’s intent and the corresponding entities,

  3. if needed, ask for any missing parts of the query,

  4. gather data necessary to build answer,

  5. build answer according to data retrieved from the bot’s knowledge base,

  6. if suitable, propose further actions.
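The six steps above can be sketched in Python for a single turn of conversation. The vocabulary, the in-memory knowledge base and the opening-hours domain are all assumptions made for illustration; a real system would replace the keyword overlap in `identify` with its own intent and entity recognition.

```python
# Assumed vocabulary and knowledge base for a toy "opening hours" domain.
INTENT_WORDS = {"show": {"show", "display", "list"}}
ENTITY_WORDS = {"opening_hours": {"hours", "open", "opening"}}
KNOWLEDGE_BASE = {("opening_hours", "monday"): "9:00-17:00",
                  ("opening_hours", "saturday"): "10:00-14:00"}
DAYS = {"monday", "saturday"}


def identify(words, vocabulary):
    """Return the first label whose word set overlaps the input words."""
    for label, known in vocabulary.items():
        if words & known:
            return label
    return None


def respond(user_input: str) -> str:
    words = set(user_input.lower().replace("?", "").split())  # step 1
    intent = identify(words, INTENT_WORDS)                     # step 2
    entity = identify(words, ENTITY_WORDS)
    if intent is None or entity is None:
        return "Sorry, I did not understand that."
    day = next(iter(words & DAYS), None)
    if day is None:                                            # step 3
        return "Which day are you interested in?"
    hours = KNOWLEDGE_BASE[(entity, day)]                      # step 4
    answer = f"On {day.capitalize()} we are open {hours}."     # step 5
    return answer + " Would you like to book a visit?"         # step 6
```

For example, `respond("show hours")` returns the follow-up question from step 3, while `respond("Show opening hours for Monday?")` runs through all six steps. The sketch is stateless, so a full implementation would also need to remember the pending query between turns.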

In order to meet the aforementioned requirement of speed, we propose the following communication model. Let us assume:

  • \(I_{i}\) - a single intent, which defines a basic action of the chatbot (“get”, “show” etc.) and the particular words corresponding to that intent,

  • \(E_{j}\) - a single entity of chosen detail (“client’s number” etc.), which defines the particular words corresponding to that entity,

  • \(I = I_{1} \cup I_{2} \cup ...\) - the set of intents known to the chatbot,

  • \(E = E_{1} \cup E_{2} \cup ...\) - the set of entities known to the chatbot,

  • P - the set of parameters needed to answer the query (most commonly numbers and strings, stored in different data sources or provided by the user).

Each specific topic of conversation will therefore be described by a separate scenario, as the chatbot needs to reference the particular data source suitable for answering the particular question or performing a given task. Each scenario \(S_{x}\) will thus be defined by a pair \((I_{i}, E_{j})\). For the purposes of this proposition, the processing methods needed to transform the input into this pair will not be described, as many different keyword extraction algorithms can be used.
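One possible way to represent this in code is a registry of scenarios keyed by the intent and entity pair. The following Python sketch assumes that the pair has already been extracted from the input; the scenario entries and the `data_source` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scenario:
    intent: str        # I_i, e.g. "get"
    entity: str        # E_j, e.g. "client_number"
    data_source: str   # hypothetical name of the source the scenario queries


# Registry of known scenarios, keyed by the (intent, entity) pair.
SCENARIOS = {
    (s.intent, s.entity): s
    for s in [
        Scenario("get", "client_number", "crm"),
        Scenario("show", "invoice", "billing"),
    ]
}


def find_scenario(intent: str, entity: str) -> Optional[Scenario]:
    """Return the scenario S_x for the pair (I_i, E_j), or None if unknown."""
    return SCENARIOS.get((intent, entity))
```

A dictionary lookup keeps scenario selection constant-time and makes the unsupported combinations explicit: `find_scenario("get", "invoice")` returns `None`, which the chatbot can turn into a clarifying question.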

The scenario should therefore define:

  1. methods of acquiring P,

  2. data D needed for building the answer,

  3. the initial building blocks for the answer A.

Acquiring P is the crucial issue in designing the chatbot’s interface and interaction model: the main purpose and requirement is speed, and we assume that users do not have time to learn the question formats that are “understood best by the chatbot”. Instead, we want the chatbot to recognize the user’s intent as quickly as possible from short input messages and then ask for any additional data, in the way a regular conversation with a human being would proceed.

Because the intent and entity clearly define the chatbot’s action, we can acquire each of the parameters \(P_{x}\) through small reply messages within the scenario. Each of these messages asks the user for an additional piece of relevant data. The set of available reply messages, \(R_{P}\), can also be defined by the designer, as they will be common to many of the scenarios (“Please provide the date”, “Please choose topic of complaint” etc.). Each of the replies existing in the model may then be translated into the controller and view of the system, equipped with convenient, specialized, user-friendly UI components. The retrieved parameters are then used by answer-specific data retrieval mechanisms, according to the knowledge base implementation. Finally, the answer is built out of A, completed with the data retrieved from the knowledge base.
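A minimal Python sketch (3.9+) of this parameter acquisition loop is given below. The `Scenario` fields, the `DEPARTMENTS` lookup and the complaint scenario are hypothetical; only the two reply messages are taken from the text. The sketch assumes that the interface stores the parameters collected so far and calls `next_step` after each user message.

```python
from dataclasses import dataclass
from typing import Callable

# R_P: reusable reply messages shared across scenarios.
REPLIES = {
    "date": "Please provide the date",
    "topic": "Please choose topic of complaint",
}

# Stand-in for the knowledge base (assumed content).
DEPARTMENTS = {"delivery": "logistics", "invoice": "accounting"}


@dataclass
class Scenario:
    params: list[str]              # names of the parameters in P
    fetch: Callable[[dict], dict]  # retrieves the data D using P
    template: str                  # A, the building block of the answer


def next_step(scenario: Scenario, collected: dict) -> str:
    """Ask for the first missing parameter, or build the final answer."""
    for name in scenario.params:
        if name not in collected:
            return REPLIES[name]
    data = scenario.fetch(collected)
    return scenario.template.format(**collected, **data)


# Hypothetical scenario: register a complaint for a given date and topic.
complaint = Scenario(
    params=["date", "topic"],
    fetch=lambda p: {"department": DEPARTMENTS[p["topic"]]},
    template="Your complaint about {topic} from {date} "
             "was forwarded to {department}.",
)
```

Because each entry of `REPLIES` corresponds to one kind of parameter, the view layer could map "date" to a date selector and "topic" to a choice field without the scenario itself knowing anything about the UI.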

The structure of this model is shown in Fig. 4.

Fig. 4. Communication provided by the scenario-based model

5 Conclusions

In this paper we have discussed and analyzed the requirements and design principles for modern business applications of chatbots within graphical interfaces. We have also proposed a flexible model of communication and conversation designed for the particular needs of closed-domain chatbots within such interfaces. As the market for and interest in chatbot technology grow [23, 24], we believe that natural language will eventually become a key part of our communication with artificial intelligence. We are also confident that the development of natural language technology will inspire the development of UI design. Certain functionalities will be handled by the conversational interface, allowing the graphical elements to focus on the more difficult aspects of data input, organisation and presentation.