1 Introduction

A deliberative model of online consultations for local governments is being prepared within the ‘In Dialogue’ project (Footnote 1). The goal of the project is to develop an internet platform supporting public consultations whose participants are city halls and citizens. The platform is envisioned as a multifunctional tool allowing for online debates of different types: synchronous textual debates, asynchronous textual debates, and synchronous voice debates. An argument mapping tool is intended to support textual debates, mainly by ordering the arguments presented throughout the discussion and by visualizing the arguments and the relations between them. Such a tool helps not only citizens but also moderators, both during the discussion and when creating the summary report.

It is easy to lose track of the main course of a discussion, especially when it is long and many people are engaged. At the end of the discussion it is not always clear which arguments led to the conclusion, if one was reached. If the results of the debate are important, as is the case in the ‘In Dialogue’ project, we would like to be able to backtrack to the arguments presented during the deliberation. Finding the arguments requires reading the whole script (potentially several times), which is a time-consuming and daunting task. Isolating arguments and presenting them visually simplifies the analysis of the arguments appearing in the discussion.

In this paper we propose a framework for automatic argument extraction and visualization for the purpose of the ‘In Dialogue’ project. The framework utilizes methods from the fields of artificial intelligence, natural language processing and data mining.

The rest of the paper is organized as follows. Section 2 presents related work concerning argument visualization and argument extraction methods. Section 3 introduces a framework for automatic argument extraction and visualization. Section 4 concludes the paper.

2 Related Work

This section summarizes approaches reported in the literature concerning methods of extracting and visualizing arguments.

2.1 Argument Visualization

In [23] argument visualization is characterized as being related to debating among many individuals or parties and constituting a presentation of reasoning in which the evidential relationships among claims are made wholly explicit using graphical or other non-verbal techniques. What is more, the structure should allow for reasoning involving propositions standing in logical or evidential relationships with each other, and thus forming evidential structures.

There exist different argument visualization methods. Most assume the use of boxes and arrows, though their usage differs between the systems that implement them. Boxes usually contain full, grammatical, declarative sentences, which are [1]: reasons (pieces of evidence in support of some claim), claims (ideas which somebody says are true), contentions (claims supported by reasons) or objections (pieces of evidence against contentions). The correct way to map an argument is to display the reasoning, i.e., boxes contain claims, not whole arguments. The boxes are linked with lines/arrows representing the relations that allow for reasoning. Other forms of argument visualization are argument matrices and argument threads. In argument matrices, rows and columns represent argument components and cells represent relations between the components. Argument threads make it possible to capture debate results in a compact form and thus help one understand what has been discussed without duplicating existing arguments [13].

Many software systems have been built to date to support argumentation and argument visualization. One of the possible categorizations of these tools is division according to the number of users into [16]: single user argumentation systems (aimed at individuals to structure their thoughts, e.g., Carneades), small group argumentation systems (useful for developing argumentation skills, learning skills of persuasion, e.g., Belvedere), and community argumentation systems (supporting large groups of participants and contributions; visualization uses discussion/argument threads aside from graphs, e.g. DebateGraph, Collaboratorium). We will briefly outline argument visualization methods in the most representative systems for each group.

Visualization in Carneades [5] is based on the Carneades Argumentation Framework (CAF) [10, 11]. CAF is built upon a formal, mathematical model of argument evaluation applying proof standards to determine the defensibility of arguments and the acceptability of statements on an issue-by-issue basis.

An argument graph constructed in Carneades plays a role comparable to a set of formulas in logic. There are two kinds of nodes in the graph: statement nodes and argument nodes. The edges of the graph link up the premises and conclusions of the arguments. Each statement is represented by at most one node in the graph. An example of an argument graph representing arguments from the law domain is given in Fig. 1.

Fig. 1. Arguments and argument graph in Carneades [10]

Pro arguments are indicated using ordinary arrowheads, con arguments with open-dot arrowheads. Ordinary premises are represented as edges with no arrowheads, presumptions with closed-dot arrowheads and exceptions with open-dot arrowheads. The direction of the edge is always from the premise to the argument. A statement may be used in multiple arguments and as a different type of premise in each argument. Cycles are not allowed.
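
The graph structure described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `Argument`, `ArgumentGraph` and `add_argument` are hypothetical and are not taken from the Carneades implementation, and proof standards are left out entirely. The sketch only captures the two node kinds, the three premise types, pro/con arguments, and the ban on cycles.

```python
from dataclasses import dataclass, field

# Premise types named in the text above.
PREMISE_KINDS = {"ordinary", "presumption", "exception"}


@dataclass
class Argument:
    ident: str
    conclusion: str   # the statement the argument supports or attacks
    pro: bool         # True for a pro argument, False for a con argument
    premises: dict = field(default_factory=dict)  # statement -> premise kind


class ArgumentGraph:
    """Hypothetical container: statement nodes plus argument nodes."""

    def __init__(self):
        self.statements = set()  # a set, so each statement has at most one node
        self.arguments = {}

    def _reaches(self, start, target):
        """True if target is reachable from start via premise -> conclusion links."""
        stack, seen = [start], set()
        while stack:
            statement = stack.pop()
            if statement == target:
                return True
            if statement in seen:
                continue
            seen.add(statement)
            stack.extend(a.conclusion for a in self.arguments.values()
                         if statement in a.premises)
        return False

    def add_argument(self, arg):
        for kind in arg.premises.values():
            if kind not in PREMISE_KINDS:
                raise ValueError(f"unknown premise kind: {kind}")
        # A new premise -> conclusion link closes a cycle if the conclusion
        # already leads (directly or indirectly) to that premise.
        for premise in arg.premises:
            if self._reaches(arg.conclusion, premise):
                raise ValueError("cycles are not allowed")
        self.statements.add(arg.conclusion)
        self.statements.update(arg.premises)
        self.arguments[arg.ident] = arg
```

Because premises are stored per argument, the same statement can serve as an ordinary premise in one argument and as an exception in another, as the description requires.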

Belvedere is a multiuser, graph-based diagramming tool for scientific argumentation. It is used for argument representation and visualization [16]. The tool uses ontologies and provides feedback. Users of Belvedere are required to categorize their statements as data, hypothesis or unspecified. The statements are then linked using relations of type: for, against, or unspecified. The system uses a simplified ontology containing two distinctions: empirical vs. theoretical, and consistency vs. inconsistency [20]. The system visualizes the argumentation in the form of a graph and a matrix. A sample argument visualization in the form of a graph and a matrix is given in Fig. 2(a) and (b), respectively.

Fig. 2. Argument visualization in Belvedere in the form of: (a) a graph, (b) a matrix.

The matrix representation organizes hypotheses (or solutions) along one axis, and empirical evidence (or criteria) along another, with matches between the two being expressed symbolically in the cells of the matrix.

Deliberatorium [13] is a tool that helps structure large-scale argumentation (such as that taking place in wikis, blogs and discussion forums), which is useful in situations where many people express their views on a problem at hand in order to find the best solution. The system's authors argue that commonly used technologies have serious shortcomings in a deliberative environment: a lack of systematic structure and a high degree of repetition, which make it hard for users to locate useful information. Forum latecomers cannot see all the important arguments or gain a good understanding of the whole discussion, which may contain many digressions and off-topic posts, unless they read it from the beginning, which is very rarely done.

To mitigate the shortcomings of the existing methods, the Deliberatorium system makes the deliberation evident by grouping and structuring the argumentation. Users of the system belong to one of three groups: moderators, authors and readers/voters. Moderators are in charge of filtering out noise and rejecting off-topic posts, as well as making sure the argument map is well structured, i.e., that all posts are properly divided into individual and non-redundant issues, ideas and arguments, and are located in the relevant branch of the argument map.

The system functions in the following manner. Authors post issues, ideas and pro/con arguments. Issues and ideas are posted only as single, short sentences. Arguments are posted using an online form consisting of a scheme containing conclusion and grounds. All users of the system can rate arguments and ideas. New posts are given a status of “pending” and only moderators can accept them which results in publication. The point of such a procedure is to limit bad or provocative posts triggering low-value discussion threads. A sample resultant argument map is presented in Fig. 3.

Fig. 3. Sample on-line argument map generated in Deliberatorium [13]

The tools commonly used in argumentation visualization require training and considerable time and effort to produce the final visualization. As [2] points out, “a trained analyst can take weeks to analyze one hour of debate” in order to visualize it. For this reason we do not plan to make detailed visualizations of complete debates. Our intention is to extract the main topics of the discussion and the main arguments pro and contra. The idea is to help the many participants (potentially hundreds of people) of a long debate become familiar with the main topics of the discussion rather than to draw a detailed visualization of the complete debate.

2.2 Approaches to Automatic Extraction of Argumentation Components

The approaches to argumentation visualization/mapping presented in the previous section are manual: one has to feed the visualizing component with manually extracted elements, such as: statement, premise and argument (in the case of the Carneades system), data, hypothesis or unspecified (in the case of the Belvedere system), or issue, idea, argument (in the case of the Deliberatorium system). It would be desirable to automate the process of extracting argumentation components necessary to realize the visualization. One way to achieve this is by using argumentation mining.

Argumentation mining is a relatively new challenge in discourse analysis [3, 12]. It can be defined as a kind of discourse analysis that involves the automatic identification of argumentation within a document, i.e., the premises, conclusion, and type of each argument, as well as the relationships between pairs of arguments in the document [12]. The approaches to argumentation mining found in the literature are promising, i.e., they achieve a good level of success, but a considerable gap still separates them from production systems. These approaches are mostly intended for the analysis of official documents (such as legal cases), customer reviews of consumer products (such as the reviews available at Amazon.com), or for the automatic analysis of debates (such as the debates available at Debatepedia.org).

Argumentation mining methods reviewed below are used to identify arguments in text and their polarization (positive, negative), as well as relations between arguments.

In [3] the authors consider the problems of identifying the illocutionary force of individual units and identifying relations between units. They specify three features of dialogue context allowing for dialogical argument mining: ‘(i) illocutionary forces, (ii) indexicality of locutions, i.e. locutions in which illocutionary force or propositional content cannot be identified without considering moves that precede a given indexical locution, (iii) transitions between dialogical moves that anchor forces of indexical locutions’. A partial implementation of an argument mining system with the assumed specifications is realized using the TextCoop platform. The purpose is to show that a dialogue can be decomposed into meaningful dialogue text units using a dedicated grammar that can identify and delimit such units, and to show how an illocutionary force can be assigned to each of these units. The conducted tests showed an 85 % effectiveness for the first task and 78 % accuracy for the other. The paper reports that the task of anchoring illocutionary forces to transitions was still under implementation.

The authors of [9, 12] posit that argumentation mining would benefit from dedicated corpora with annotations such as: the data, warrant, conclusion and argumentation scheme of each argument; multiple arguments for the same conclusion; and chained relationships between arguments.

The authors of [25] consider a semi-automated approach to argumentative analysis. They take into consideration arguments present in online product reviews, in particular reviews taken from Amazon.com concerning a selected model of a digital camera. The approach consists of five layers of analysis: a consumer argumentation scheme, CAS (dedicated to buying a camera and built of premise and conclusion schemes related to that activity); a set of discourse indicators (indicators of premise, e.g.: after, as, for, since; of conclusion, e.g.: therefore, consequently; of contrast, e.g.: but, except, not); sentiment terminology (from highly positive to highly negative); a user model (user’s parameters: age, gender, etc.; context of use; constraints: cost, portability, etc.; quality expectations); and a domain model. To find these components the authors used the GATE, JAPE, and ANNIC open source tools. The corpus is iteratively searched for properties instantiating the argumentation scheme, identifying attacks. After the instantiated arguments in attack relations have been gathered, the argumentation framework is evaluated. The premises instantiate the CAS in a positive (for buying the camera) or negative (against buying the camera) way.

Argumentation structure detection has been reported in [4]. It is based on calculating textual entailment to detect support and attack relations between arguments in a corpus of online dialogues from Debatepedia stating user opinions. The EDITS system is used to detect the relations. The evaluation has two steps: the assignment of relations to the data set (0.67 accuracy was reported), and an assessment of how incorrect assignments influence the evaluation of the accepted arguments (mistakes in the assignment propagate, but the results are still satisfactory).

In [24] the authors argue that many evaluative expressions with a heavy semantic load are in fact arguments and that the association of an evaluative expression with the discourse structure must be interpreted as an argument. The authors develop a global semantic representation for these constructions and perform tests using the TextCoop platform. The reported tests show high effectiveness of discovering discourse relations (justification, reformulation, illustration, precision, comparison, consequence, contrast, concession) in terms of precision and recall. The goal of discovering these relations is to determine why consumers or citizens are happy or not with a given product or decision. The authors observe that to be able to automatically synthesize any text in the proposed manner a very rich semantic lexicon and a set of inferential patterns are needed.

The authors of [22] focus on finding argument-conclusion relationships in German discourses. They follow an approach consisting of the following steps: manual discourse linguistic argumentation analysis (the aims of this stage are the identification of discourse relevant arguments, the formation of argument classes and the determination of the significance of an argument in the discourse), text mining (PoS tagging and linguistic annotation, polarity detection), and data merge. The results of the analysis are words indicating an argument-conclusion relationship (such as because, since, also, …). These words, however, do not indicate where the argument or conclusion starts or ends, and additional steps are required to identify their extent, e.g. text windows to the left and right of the conclusive (the indicator word).
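
The idea of taking text windows around an indicator word can be illustrated with a short sketch. It is not the procedure of [22]: the indicator list below consists of English stand-ins taken from the examples quoted in this section, and the function name `indicator_windows` and the window size are our own assumptions.

```python
import re

# Illustrative English indicators; [22] works with German discourse.
INDICATORS = {"because", "since", "therefore", "consequently"}


def indicator_windows(text, window=5):
    """Return (indicator, left_tokens, right_tokens) for every indicator found."""
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token in INDICATORS:
            left = tokens[max(0, i - window):i]     # candidate text before the indicator
            right = tokens[i + 1:i + 1 + window]    # candidate text after the indicator
            hits.append((token, left, right))
    return hits
```

A fixed-size window is only a rough approximation of the extent of an argument or a conclusion, which is why the text above speaks of additional steps.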

3 Automatic Argument Extraction and Visualization Framework

While the solutions presented in Sect. 2.1 assume the indication of a thesis, solution or proposal (Footnote 2), and of arguments for or against them, they do not offer any automated support for creating a structure of proposals and arguments. The approaches presented in Sect. 2.2 try to systematize the process of dividing text documents from various domains into structured parts. However, they do not show how to maintain the whole process, from extraction, through additional transformations and the assignment of relations, to the storage and visualization of the arguments and proposals of a debate. In this section we analyze the possible support of Artificial Intelligence (AI) in argument mapping and propose a framework that employs AI techniques in order to reduce human workload.

The results of debates may be of different types; however, they have at least one property in common: they are unstructured, e.g., the script of an online or direct debate. In order to transform unstructured text into structured relations of proposals and arguments, we propose to employ Text Mining/Natural Language Processing (NLP) algorithms that automatically extract the proposals and arguments present in the text. Moreover, we show how to connect the arguments with relations and store the results in a flexible manner.

3.1 Overview of the Framework

Our framework aims at reducing human workload in creation of a structured representation of a debate. Thus, we consider the whole process of unstructured text transformation into debate results stored according to a knowledge representation.

Figure 4 shows the framework for the automatic extraction of proposals and arguments, their simplification, their transformation in accordance with a knowledge representation, and their visualization. The source of data is an unstructured text, e.g., a script of an online or direct debate, or an unstructured forum. The input is processed by means of Text Mining/NLP techniques in order to extract proposals and arguments and their relations. Optionally, the proposals and arguments may be verified and changed by a human at this stage. The next step is to transform the proposals and arguments so that they become simpler and more informative, or are combined into one if, e.g., two or more arguments are the same. Human interaction may also take place after the transformation. Then the proposals and arguments are stored in a knowledge base according to a knowledge representation. All the aforementioned steps are described in detail in the following sections.

Fig. 4. Framework for automatic argumentation mapping and visualization

3.2 Proposals and Arguments Extraction

In order to help a human structure the results of a debate, algorithms for automatic extraction can be employed. They use a supervised approach, hence we need to collect a corpus of annotated debate scripts, that is, scripts with tagged proposals and arguments. We plan to use and annotate the scripts of the debates that are being run within our project. Once the corpus has been collected, we need to transform the text data so that it can be used by the extraction algorithms. The first step is preprocessing, which consists of stop word removal, stemming or lemmatization, and conversion of letters to lower case. Then we divide the corpus into training, validation, and test sets. The training set is used to learn a model, the validation set to choose the best parameters for the model, and the test set for the final evaluation of the model.
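
The preprocessing and splitting steps can be sketched as follows. This is a minimal illustration, not our production pipeline: the stop word list is a toy one, the split proportions and the names `preprocess` and `split_corpus` are assumptions made for the example, and stemming or lemmatization is omitted because it requires a language-specific component.

```python
import random

# Toy stop word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "is", "to", "at", "that"}


def preprocess(text):
    """Lower-case the text, strip surrounding punctuation, drop stop words."""
    tokens = [t.strip(".,;:!?'\"").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def split_corpus(docs, train=0.6, valid=0.2, seed=0):
    """Shuffle the documents and split them into training, validation and test sets."""
    docs = list(docs)
    random.Random(seed).shuffle(docs)   # fixed seed keeps the split reproducible
    n_train = int(len(docs) * train)
    n_valid = int(len(docs) * valid)
    return (docs[:n_train],
            docs[n_train:n_train + n_valid],
            docs[n_train + n_valid:])
```

Splitting at the level of whole debate scripts, rather than single sentences, keeps sentences from one debate from ending up in both the training and the test set.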

Conditional Random Fields (CRF) models [7, 14, 26] have recently been the state of the art in information extraction by sequence labeling, and the linear-chain CRF model is often used for this task [21]. It treats text as a sequence of tokens with a label assigned to each token; in our case the labels are proposal, argument, and none. A token is characterized by its neighboring tokens and by other features based on these tokens. We propose to use the following features: lemmatized tokens, their part-of-speech tags, argument introduction words (words that introduce arguments, e.g., because) encoded as a binary indicator, and proposal introduction words (e.g., propose, solution).
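
A sketch of how such token features might be assembled is given below. The feature names, the cue word sets and the one-token context window are illustrative assumptions; in particular, encoding the proposal introduction words as a binary indicator as well is our choice for this example. The lemmas and part-of-speech tags are assumed to come from an external, language-specific tagger.

```python
# Illustrative cue words taken from the examples in the text.
ARGUMENT_CUES = {"because", "since"}
PROPOSAL_CUES = {"propose", "solution"}


def token_features(lemmas, pos_tags, i):
    """Features of token i: its own properties plus those of its neighbors."""
    feats = {
        "lemma": lemmas[i],
        "pos": pos_tags[i],
        "is_argument_cue": lemmas[i] in ARGUMENT_CUES,
        "is_proposal_cue": lemmas[i] in PROPOSAL_CUES,
    }
    if i > 0:
        feats["prev_lemma"] = lemmas[i - 1]
        feats["prev_pos"] = pos_tags[i - 1]
    else:
        feats["BOS"] = True   # beginning of the sequence
    if i < len(lemmas) - 1:
        feats["next_lemma"] = lemmas[i + 1]
        feats["next_pos"] = pos_tags[i + 1]
    else:
        feats["EOS"] = True   # end of the sequence
    return feats


def sentence_features(lemmas, pos_tags):
    """One feature dictionary per token, in sequence order."""
    return [token_features(lemmas, pos_tags, i) for i in range(len(lemmas))]
```

A list of per-token feature dictionaries is the kind of input that CRF toolkits such as sklearn-crfsuite accept for training, with a parallel list of labels (proposal, argument, none) as the target.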

The trained linear-chain CRF is used to predict proposals and arguments. Moreover, the number of labels can be increased to distinguish positive-argument from negative-argument. The procedure for training and using linear-chain CRF models stays the same. We can also predict whether an argument is positive or negative by using one of the sentiment analysis methods described in [8, 15]. Among the most recent models used in sentiment analysis, deep neural network models [19] are worth mentioning and using.

As an example we can consider the following part of a debate script: ‘The location of a primary school is really important. I propose to build the school at Markan street, because many people living nearby could send their children to that school.’ It consists of a proposal and an argument. The CRF model would annotate the aforementioned example with the following entities: ‘The location of a primary school is really important. I propose to <proposal begin> build the school at Markan street <proposal end>, because <argument begin> many people living nearby could send their children to that school <argument end>.’
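
Assuming the model outputs one label per token, the annotated spans of this example can be recovered by grouping consecutive tokens that share a label. The function name `extract_spans` and the whitespace tokenization are assumptions made for this sketch.

```python
from itertools import groupby


def extract_spans(tokens, labels):
    """Group consecutive tokens with the same label into (label, text) spans."""
    spans = []
    position = 0
    for label, group in groupby(labels):
        length = len(list(group))
        if label != "none":
            spans.append((label, " ".join(tokens[position:position + length])))
        position += length
    return spans


tokens = ("I propose to build the school at Markan street because many people "
          "living nearby could send their children to that school").split()
# 3 unlabeled tokens, 6 proposal tokens, "because", then 11 argument tokens.
labels = ["none"] * 3 + ["proposal"] * 6 + ["none"] + ["argument"] * 11
spans = extract_spans(tokens, labels)
```

With this simple labeling, two adjacent spans of the same type would be merged into one; a begin/inside labeling scheme would be needed to keep them apart.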

If the corpus contains annotated relations between proposals and arguments, we may also predict the relations that connect arguments with their proposals, and even sub-arguments with arguments. Assuming that the extraction of proposals and arguments is done first, we create a classifier that predicts whether an argument is related to a proposal. To this end an SVM classifier may be employed, based on bag-of-words features calculated for the words of the considered proposal and argument, or argument and sub-argument.
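
One possible way to build the input of such a classifier is to concatenate the bag-of-words vectors of the two texts, as sketched below. The concatenation, the whitespace tokenization and the function names are assumptions of this example; other encodings of the pair are equally possible.

```python
from collections import Counter


def build_vocab(texts):
    """Map every word seen in the texts to a fixed vector position."""
    words = sorted({w for text in texts for w in text.lower().split()})
    return {word: index for index, word in enumerate(words)}


def bow(text, vocab):
    """Bag-of-words count vector of a text over the given vocabulary."""
    vector = [0] * len(vocab)
    for word, count in Counter(text.lower().split()).items():
        if word in vocab:
            vector[vocab[word]] = count
    return vector


def pair_features(proposal, argument, vocab):
    """Feature vector of a (proposal, argument) pair: both vectors side by side."""
    return bow(proposal, vocab) + bow(argument, vocab)
```

Vectors built this way, each paired with a related/unrelated label from the annotated corpus, could then be passed to an SVM implementation such as `sklearn.svm.LinearSVC`.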

The solution presented herein is language dependent in the sense that, although its main steps do not depend on the language, their implementations differ between languages; e.g., for English we need to use an English stemmer, whereas for Polish we use a stemmer developed for that language.

The aforementioned CRF and SVM models predict proposals and arguments, as well as the relations between them, and create the structure presented in Fig. 4 (depicted in the second block and obtained by extraction). This structure may be verified and improved by a human to ensure the high quality of the extraction.

3.3 Proposals and Arguments Transformation

Having extracted the proposals and arguments, we transform them within our framework to simplify and shorten them, that is, to make them denser in terms of the information they carry. Simpler proposals and arguments containing aggregated information let a user understand the ideas behind a debate more easily. To this end, we propose to calculate the semantic similarity between two arguments or two proposals and decide whether they are semantically equivalent. If so, we keep the shorter one and remove the longer proposal/argument. The children of a removed proposal/argument, e.g., arguments that are related to a proposal, are attached to the shorter proposal/argument and then checked for semantic equivalence with other proposals/arguments.

Semantic equivalence may be calculated using recursive autoencoders [18]. This algorithm uses word embedding vectors to represent words and trains recursive autoencoders to represent a sentence. On top of these vectors a classifier is built in order to judge whether two sentences are semantically equivalent. Although recursive autoencoders are trained in an unsupervised manner, the classifier needs an annotated corpus. However, the classifier may be reduced to a simple similarity of vectors compared against a given threshold, which avoids the need for an annotated corpus.

Let us consider as a transformation example that there are two arguments discovered by the CRF model: (i) many people living nearby could send their children to that school; (ii) in this location there are so many blocks of flats that many residents would be happy to have a school nearby. The algorithm compares these two arguments and decides that the semantic similarity is high, thus the second argument can be removed and the first argument, as the shorter one, is left.
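
The threshold-based variant of this step can be sketched as follows. The sketch deliberately replaces the recursive autoencoder sentence vectors with plain bag-of-words counts so that it stays self-contained; the `embed` function, the threshold value used below and the dictionary layout (`nodes`, `children`) are assumptions of the example, not parts of the framework.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in sentence vector: bag-of-words counts (NOT a recursive autoencoder)."""
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def merge_equivalent(nodes, children, threshold):
    """Merge nodes of one kind (all arguments or all proposals).

    nodes: id -> text; children: id -> list of child ids.
    The shorter text is kept and inherits the children of the removed one.
    """
    nodes = dict(nodes)
    children = {k: list(v) for k, v in children.items()}
    merged = True
    while merged:              # repeat until no pair is similar enough
        merged = False
        ids = sorted(nodes)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                if cosine(embed(nodes[a]), embed(nodes[b])) >= threshold:
                    keep, drop = (a, b) if len(nodes[a]) <= len(nodes[b]) else (b, a)
                    children.setdefault(keep, []).extend(children.pop(drop, []))
                    for kids in children.values():
                        if drop in kids:
                            kids.remove(drop)   # detach the removed node from its parent
                    del nodes[drop]
                    merged = True
                    break
            if merged:
                break
    return nodes, children
```

With bag-of-words vectors, the two arguments of the example above have a cosine similarity of roughly 0.38, so they are merged only for a fairly low threshold; sentence vectors that capture meaning rather than word overlap are expected to separate equivalent from non-equivalent pairs more clearly.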

This step may also be verified by a human in order to ensure the high quality of the results.

3.4 Proposals and Arguments Storage

Once the proposals, arguments, and relations have been extracted, we need to store them in a knowledge base according to a knowledge representation. In our framework we use an ontology to model proposals, arguments and their relations. The two main concepts are proposal and argument. We add the extracted proposals and arguments as instances of the concepts proposal and argument, respectively. We also model the relations between them by introducing is-argument-of-proposal and is-sub-argument. Moreover, positive and negative arguments are indicated by properties. Modeling proposals, arguments and their relations by means of an ontology is motivated by its flexibility. We can add further relations, e.g., connecting arguments that are semantically similar to some extent but have not been merged. Moreover, we can define relations, for instance sibling-argument, and use inference to connect the instances that are coupled by these relations. This ontology structure can be transformed into the form of the AIF ontology’s Argument Network [6] to enable integration with external visualization tools.
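
The storage model can be illustrated with a small in-memory triple store. This is a sketch only: the class `KnowledgeBase`, the `instance-of` and `polarity` property names, and the rule that two arguments of the same proposal are siblings are assumptions introduced for the example; only the relation names is-argument-of-proposal, is-sub-argument and sibling-argument come from the description above.

```python
class KnowledgeBase:
    """Hypothetical in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        return {o for (s, p, o) in self.triples if s == subject and p == predicate}

    def infer_siblings(self):
        """Assumed rule: arguments of the same proposal are sibling arguments."""
        by_proposal = {}
        for s, p, o in self.triples:
            if p == "is-argument-of-proposal":
                by_proposal.setdefault(o, set()).add(s)
        new = {(a, "sibling-argument", b)
               for args in by_proposal.values()
               for a in args for b in args if a != b}
        self.triples |= new
        return new


kb = KnowledgeBase()
kb.add("p1", "instance-of", "proposal")
kb.add("a1", "instance-of", "argument")
kb.add("a2", "instance-of", "argument")
kb.add("a1", "is-argument-of-proposal", "p1")
kb.add("a2", "is-argument-of-proposal", "p1")
kb.add("a1", "polarity", "positive")
kb.add("a2", "polarity", "negative")
inferred = kb.infer_siblings()
```

A new relation is added simply by storing triples with a new predicate, which is the flexibility that motivates the ontology-based representation.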

Independently of existing tools the ontology of proposals and arguments can be easily visualized to support the process of debate analysis (please refer to Sect. 3.5).

3.5 The System

Currently, our system implements the structure for storing proposals and arguments. We have also designed and implemented a visualization module. An example visualization is shown in Fig. 5. It contains one additional element compared to those considered in Sect. 3.1, namely a question (represented by the top-left rectangle without rounded corners). During a debate several questions can be asked in order to learn the opinions of residents on a given topic. Each question has its own proposals (blue rectangles with rounded corners), which are followed by positive (green) or negative (red) arguments. Arguments may have their own arguments (e.g., the negative red argument).

Our framework will be empirically verified, as we are currently implementing the extraction module based on an SVM classifier and Conditional Random Fields. The next step is the transformation of proposals and arguments and their storage. All modules will be verified on the scripts of real debates that are being gathered during the consultations run within the project. Moreover, the system can be verified during real consultations, as it serves to automate the process: after the automated part, a clerk verifies the proposals and arguments against the debate script. Positive results of this verification would demonstrate the usefulness of our system. The debates are conducted in Polish; however, we are going to support English in order to verify the applicability of our framework to different languages.

Fig. 5. Proposals and arguments visualization implemented in the system. (Color figure online)
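
The hierarchy shown in the visualization (question, proposals, arguments, sub-arguments) can be mimicked with a simple text rendering. This is not the implemented visualization module: the dictionary layout, the markers and the function name `render` are assumptions, and the question and the negative sub-argument in the sample tree are invented for the illustration.

```python
# Text markers standing in for the shapes and colors of the graphical view.
MARKERS = {"question": "?", "proposal": "*", "positive": "+", "negative": "-"}


def render(node, depth=0):
    """Return the tree as indented lines, one line per node."""
    lines = ["  " * depth + f"{MARKERS[node['kind']]} {node['text']}"]
    for child in node.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines


tree = {
    "kind": "question",
    "text": "Where should the new primary school be built?",   # invented question
    "children": [{
        "kind": "proposal",
        "text": "Build the school at Markan street",
        "children": [{
            "kind": "positive",
            "text": "Many people living nearby could send their children to that school",
            "children": [{
                "kind": "negative",
                "text": "The nearby streets are already congested",   # invented
            }],
        }],
    }],
}
print("\n".join(render(tree)))
```

The recursion mirrors the structure of the stored data: any argument can carry its own positive or negative arguments to an arbitrary depth.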

4 Conclusions

In this paper, we investigated the task of automatic extraction and visualization of proposals and arguments in the context of online consultations conducted by local governments. Firstly, we discussed the approaches to visualization and (semi-)automatic extraction of proposals and arguments presented in the literature. Secondly, we proposed a framework allowing for the automatic extraction of arguments from deliberations. The proposed framework assumes the extraction of proposals and arguments, sentiment analysis to predict whether an argument is negative or positive, and classification to decide how the arguments are related. Moreover, the framework facilitates the transformation of proposals and arguments (simplification and clustering) in order to combine those that are semantically equivalent and, in consequence, to help participants and moderators by simplifying the analysis of a debate. Meaningful and simplified proposals and arguments are stored according to a knowledge representation method, an ontology in our case.

The proposed framework is currently under implementation within the project ‘In Dialogue’. Storage for proposals and arguments has already been implemented. Furthermore, we presented the example visualization of proposals and arguments provided by our implemented module. The system will be used in the process of summarizing debates. It will automatically prepare a summary by extracting proposals and arguments and their relations. A clerk who needs to summarize a debate will have an automatically created logical structure of a debate that he/she only needs to refine. The system will reduce time spent on summary preparation.

As the consultations run within the ‘In Dialogue’ project are conducted in Polish, we are implementing the proposed framework for this language. However, we also plan to support English. In order to prepare the system to process English, we will need to apply language dependent NLP components, such as a stemmer, a part-of-speech tagger, and a named entity recognizer. We will also need to retrain the CRF and SVM models and the recursive autoencoders on English training sets.

The results of the consultations being run within the project will be used in our framework in order to train supervised models. Then, the framework will be tested in forthcoming consultations and, in the end, it will be made available for local governments in order to support citizens and moderators during consultations.