1 Introduction

Internet users increasingly interact with others through new and advanced forms of online communication. Compared to face-to-face communications, participation in these activities is often large-scale, anonymous, asynchronous, and open to any Internet user or registered community member. In addition, users can choose when to join and when to leave the communication, and may differ widely in demographics, domain expertise, and professional background. Because of these characteristics, it can be challenging for individual users to evaluate the others’ ideas and keep track of the others’ perspectives and justifications in these Online Open Participative (OOP) environments. One approach to address this challenge is to first automatically identify statements that contain one’s rationales from the communication record and then present them to the participants to raise their awareness of these rationale statements. These rationale-containing statements are argumentative discourse units that include rationales and some limited context around them. Various studies have been conducted to explore how to detect them automatically [15, 23]. For example, a few studies have focused on examining discourse relations that are commonly present in rationale-containing statements [3, 10, 11, 24].

In this study, we explore the existence of authority claims in rationale-containing statements. Authority claims are statements made by a discussion participant aimed at bolstering their credibility in the discussion [2]. We speculated that people tend to make such statements when they provide their justifications in OOP environments because they have little knowledge about the others they communicate with, the participation is open with little or no background check or requirement, and there are many participants in a discussion context. In addition, OOP environments often offer few or no non-verbal cues that help participants establish their credibility. If we do discover strong correlations between rationale-containing statements and authority claims, then the detection of authority claims may contribute to the detection of rationale-containing statements.

We annotated authority claims in the rationale-containing statement datasets from [21, 24]. In the following sections, we present our annotation process and results in detail, and then discuss the implications of our findings and our next steps.

2 Our Datasets

We leveraged the rationale-containing statements from [24]. [24] obtained five substantial datasets from Rutgers’ argument mining group. Each dataset consists of text segments from a blogpost published at Technorati (technorati.com) between 2008 and 2010 and its first 100 comments. These five blogpost datasets are about different issues. Specifically, the Android and iPad datasets are about the user interface and usability of Android devices and the iPad. The Ban dataset is about the ban on sharing music on social media. The Layoff dataset is about layoffs and outsourcing in the United States. The Twitter dataset is about Twitter as a social media tool. Rutgers’ researchers had human experts and Amazon Mechanical Turkers annotate the blogposts and the comments to identify two types of text segments: targets and call-outs. According to their annotation guidelines [21], a target is a prior action that a call-out responds to or comments on in some way. A call-out includes one or both of the following: (a) explicit stance (indication of attitude or position relative to the target), and (b) explicit rationale (argument/justification/explanation of the stance taken). [24] analyzed the call-outs in these datasets to identify those that contain rationales. They then annotated the discourse relations in these rationale-containing statements using rhetorical structure theory (RST) [13] and identified ten common discourse relations in the statements.

In our study, we reviewed the rationale-containing statements from [24] and filtered out overlapping sentences. We then proceeded to annotate the authority claims.

3 Annotation of Authority Claims

Authority claims are statements made by a discussion participant aimed at bolstering their credibility in the discussion [2]. According to [2], a writer may use various strategies to bolster his/her credibility, e.g., citing an external credible source, appealing to common sense, or drawing on personal experience. Our analysis examined whether the statement reveals that the writer intended to bolster his/her credibility in making the statement. We are interested in the writer’s intention because, as pointed out in the Introduction, we speculated that participants would feel a need to bolster their credibility when giving their claims and rationales in OOP environments.

We give three examples to illustrate our annotation focus. Consider this statement: Apples are good for your health. They are extremely rich in important antioxidants, flavonoids, and dietary fiber. The first sentence is the writer’s claim. From the writer’s perspective, it is relatively clear that the function of the second sentence is to provide an explanation of the claim, whereas its function to bolster the writer’s credibility in making the claim is not evident. In our analysis, we did not consider this statement to contain an authority claim.

Consider this second example: I remember Apple telling people give the UI and the keyboard a month and you’ll get used to it. Plus all the commercials showing the interface. So, no, you didn’t just pick up the iPhone and know how to use it. It was pounded into to you. In this statement, the last two sentences reflect the writer’s claim: “No, you didn’t just pick up…It was …” The first two sentences provide rationales for this claim. To us, the act of adding “I remember” to the utterance hints at the writer’s intention to bolster his/her credibility. Also, by saying “all the commercials…”, the writer bolsters his/her credibility using the common sense strategy explained by [2].

The third example uses this statement: On the other hand, from what I’ve seen with Android, it’s not so much the differences in the UI, it’s the inconsistency from one part of the UI to another. It’s the classic Linux desktop problem. It’s so open that everything on it has its own way of working and interacting. To us, the key phrase for annotating the authority claim in this statement is “from what I’ve seen with Android”. Without this phrase, the statement is still grammatically correct and its core meaning does not change. The addition of this phrase reflects the writer’s intention to bolster his/her credibility by emphasizing personal experience.

Before the annotation, we first separated the text into sentences based on punctuation marks: sentences are separated by a period, exclamation mark, question mark, or suspension points. While the intended or actual influence of an authority claim may span multiple sentences, it is the occurrence of the bolstering intention in a sentence that is considered in our study; hence, our analysis was at the sentence level. The following statements are considered individual sentences according to this rule. While suspension points are often ellipses, there are cases in which users used other punctuation marks, like the hyphen in the last example below.

  • But once one scaled the usage up, the number of windows open to reach a specific file exploded.

  • I took Nexus on a trial basis for a week and have decided that it’s a much better fit for a peculiar audience, primarily MIT engineers.

  • I have a macpro and a macbookpro, and on my MPB, I run Win7 as the default OS because I find it more intuitive and easy
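The segmentation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function name `split_sentences` and the regular expression are hypothetical, a boundary is assumed to be whitespace that follows a period, exclamation mark, question mark, or ellipsis, and the occasional use of a hyphen as suspension points is not handled.

```python
import re

# Assumption: a sentence boundary is whitespace preceded by '.', '!', '?',
# or the single-character ellipsis. A run such as "..." ends in '.', so it
# is treated as one boundary.
_BOUNDARY = re.compile(r"(?<=[.!?…])\s+")


def split_sentences(text):
    """Split text into sentences at terminal punctuation (hypothetical helper)."""
    parts = _BOUNDARY.split(text.strip())
    # Drop empty fragments that can arise from stray whitespace.
    return [part for part in parts if part]


example = (
    "I remember Apple telling people give the UI and the keyboard a month "
    "and you'll get used to it. Plus all the commercials showing the "
    "interface. So, no, you didn't just pick up the iPhone and know how to "
    "use it. It was pounded into to you."
)
sentences = split_sentences(example)
```

A purely punctuation-based rule like this one would also split after abbreviations such as “e.g.”, so its output would likely need a manual check before annotation.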

Our annotation process was iterative. At the beginning, the two authors met, discussed the authority claim concept, and annotated a snippet of the data independently (the first author majors in Information Science, and the second in the Computational Linguistics program at the Department of Linguistics). They then exchanged the analysis results and discussed the differences. After two rounds of this process, they finalized the criteria for an authority claim. The second author then annotated the rest of the data. An intra-coder reliability measure [20] was used to assess the reliability of the second author’s annotation work. Specifically, he annotated the data again after two weeks and we compared the two results using Krippendorff’s Alpha [6]. We calculated this value using a sentence as the unit. That is, we identified the total number of sentences and the total number of authority claim sentences in the two annotation exercises and compared the results.
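To make the reliability computation concrete, the sketch below computes Krippendorff’s Alpha for nominal labels from two annotation passes over the same sentences with no missing values. It is a minimal illustration, not the tool used in the study: the function name `krippendorff_alpha_nominal` is hypothetical, and the binary labels (1 for an authority claim sentence, 0 otherwise) are toy data, not our annotations.

```python
from collections import Counter


def krippendorff_alpha_nominal(first_pass, second_pass):
    """Krippendorff's alpha for nominal labels, two passes, no missing data."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must label the same sentences")
    n = 2 * len(first_pass)  # total number of assigned labels
    label_counts = Counter(first_pass) + Counter(second_pass)
    # Observed disagreement: each disagreeing sentence contributes two
    # ordered label pairs to the coincidence matrix.
    observed = 2 * sum(a != b for a, b in zip(first_pass, second_pass))
    # Expected disagreement: ordered pairs of different labels drawn from
    # the pooled label counts.
    expected = sum(
        label_counts[c] * label_counts[k]
        for c in label_counts
        for k in label_counts
        if c != k
    )
    if expected == 0:
        return float("nan")  # only one label was ever used; alpha is undefined
    # alpha = 1 - D_o / D_e, with D_o = observed / n
    # and D_e = expected / (n * (n - 1)).
    return 1.0 - (n - 1) * observed / expected


# Toy labels for ten sentences (illustrative only).
pass_one = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
pass_two = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal(pass_one, pass_two)  # about 0.095
```

Identical passes give an alpha of 1, while values near 0 indicate agreement no better than chance, as in the toy data above.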

This reliability check showed good agreement between the two annotation results for all five datasets, as shown in Table 1.

Table 1. Number of rationale-containing call-outs in our dataset.

In the subsequent semester, we recruited and trained a graduate student majoring in Information Management to annotate authority claims based on our coding instructions, and had her annotate the datasets again. We compared her annotation results with the second author’s, discussed the differences, and reached agreement in the end.

4 Results and Discussion

As shown in Table 1, our dataset contains 271 rationale-containing call-outs and 1,124 sentences. Only 42 of these sentences contain authority claims, which is about 3.7% of the data. This result contradicts our speculation. In other words, the participants in our dataset scarcely attempted to bolster their credibility when presenting their reasoning to the others.

There are several possible explanations for this finding. First, the datasets we examined are online blogs and the comments below them, not the content of online deliberations or debates. Therefore, the participants’ intention may have been to express their views rather than to persuade others to agree with them, so they were little concerned about whether the others would view them as credible. To examine whether this explanation is valid, we are annotating online Reddit discussions to explore the existence and percentage of authority claims in their arguments. We also plan to explore other online debate datasets such as the Internet Argument Corpus [22].

Second, it is possible that the participants did intend to persuade others but used different persuasion strategies. Commonly defined as “human communication that is designed to influence others by modifying their beliefs, values, or attitudes” ([19], p. 21), persuasion can appear in various forms in communication records through different strategies. For example, Aristotle’s work focuses on the speaker’s acts to make the comment persuasive and is commonly adopted by the related communities [8]. In Aristotle’s view, persuasion depends on the credibility of the speaker (ethos), the emotions of the audience (pathos), as well as on the cogency of the arguments employed and their ability to show our claims to be correct (logos). Authority claims reflect the use of ethos in persuasion strategies. Therefore, it is possible that the participants in our datasets used other persuasion acts more than ethos. Interestingly, a recent annotation study [7] also shows that, of the three persuasion modes, ethos was used least when participants offered premises for their claims in an online persuasive forum. In that study, the authors annotated 78 discussion threads that include 278 turns of dialogue. These consist of 2,615 propositions in 2,148 total sentences. Of these sentences, 1,068 contain a premise and only 3% of these premise sentences contain ethos. We also note that [7] annotated the text at the sentence level as well.

Third, it has been shown that we behave differently in online social activities than in offline social interactions [17, 18]. It is possible that the type of communication medium affects how one’s credibility is established in communication or how people reason. For example, in online communications, the user’s credibility or authority may be established through other information channels in the environment, such as the user’s profile. If this is the case, it is perhaps not sufficient for argument mining research to depend only on existing theoretical frameworks of argumentation and reasoning, which are mainly based on face-to-face communication and interactions. Further investigation is needed to help us better understand the effects of the communication medium on the argumentation process at individuals’ cognitive and meta-cognitive levels.

5 Conclusion

Internet users increasingly interact with others through these new and advanced forms of online communication. Many of these interactions involve complex processes of persuasion and influence [9, 14]. Researchers explore computational techniques to automatically identify components of participants’ arguments in these activities, such as their stances [1, 16] and justifications and supporting statements [4, 12].

In this research program, we explore indicators of one’s reasoning traces, focusing on one’s rationales. Our objective in this annotation study is to examine whether the existence of authority claims is an indicator of where one’s rationales appear in OOP environments. Our work is preliminary both in terms of the size of the data and the communication context. We are exploring the occurrence of authority claims in larger datasets and how the online activity’s context correlates with the occurrence. Interestingly, research on scientific discourse overlaps with ours, e.g., the knowledge attribution and epistemic evaluation model in scientific discourse by De Waard and Maat [5]. One direction of our future work is to compare the findings from social media content and scientific discourse to further explore the effect of contextual factors on the choice of argumentation strategies.