Abstract
This chapter presents a rich annotation scheme for mentions, co-reference, meronymy, sentiment expressions, and modifiers of sentiment expressions (including neutralizers, negators, and intensifiers), and describes a large corpus annotated with this scheme. We define the various annotation types, provide examples, and report statistics on occurrence and inter-annotator agreement. This resource is the largest sentiment-topical corpus to date and is publicly available. It helps quantify sentiment phenomena, supports the construction of advanced sentiment systems, and enables direct comparison of different algorithms.
Work was conducted while both authors were at J.D. Power and Associates Web Intelligence, McGraw Hill.
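As a rough illustration of the inter-annotator agreement statistics mentioned in the abstract, the sketch below computes Cohen's kappa for two annotators who label the same items. The reference list cites Cohen's coefficient, but whether the chapter reports exactly this statistic is an assumption here; the function name `cohen_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' nominal labels over the same items.

    Illustrative sketch only: not taken from the chapter or its tooling.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label
    # proportions, summed over all categories.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example: two annotators assign polarity to four sentiment expressions.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohen_kappa(annotator_1, annotator_2))
```

Kappa discounts the agreement expected by chance, so it is generally preferred over raw percent agreement when label distributions are skewed.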
Notes
1. We distinguish between the immediate target of a sentiment expression and a document-level topic. Other work, such as [27], has addressed the problem of developing topic-dependent feature sets for supervised classification of document-level polarity.
2. Called “negatives” in [29].
3. The TimeML corpus [30] has explicit annotations for counter-factive events and treats negation as a property of an event. We believe that both act the same way with respect to contextual polarity.
4. Reference [31] presents a corpus containing “certainty markers”, or expressions indicating commitment to a sentence or a clause and its level of certainty, on a scale from uncertain through absolute certainty. Our committers are judged on a binary scale: whether they raise or lower the author’s commitment to a sentiment expression or modification.
5. The problem of determining whether an event is asserted as true, as false, or with unknown truth-value is called veridicity [16]. Reference [18] describes a rule-based system, tailored to the blogosphere, for recognizing the veridicity of some clauses, and has released a lexicon that includes “neutral veridicality elements”, which neutralize their argument clauses.
6. Discussion of descriptors is omitted due to space constraints. See the annotation guidelines [10] for details about this annotation.
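The modifier types discussed in the abstract and the notes above (negators, neutralizers, intensifiers, committers) can be pictured with a small data model. This is a minimal sketch under stated assumptions, not the corpus's actual schema: the class names, the integer polarity encoding, and the composition rule (a negator flips polarity, a neutralizer cancels it, intensifiers and committers leave its sign unchanged) are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ModifierKind(Enum):
    NEGATOR = "negator"
    NEUTRALIZER = "neutralizer"
    INTENSIFIER = "intensifier"
    COMMITTER = "committer"


@dataclass
class Modifier:
    text: str
    kind: ModifierKind


@dataclass
class SentimentExpression:
    text: str
    prior_polarity: int  # assumed encoding: +1 positive, -1 negative
    modifiers: List[Modifier] = field(default_factory=list)


def contextual_polarity(expr: SentimentExpression) -> int:
    """Assumed composition rule, for illustration only."""
    polarity = expr.prior_polarity
    for modifier in expr.modifiers:
        if modifier.kind is ModifierKind.NEUTRALIZER:
            return 0  # a neutralized expression carries no polarity
        if modifier.kind is ModifierKind.NEGATOR:
            polarity = -polarity  # each negator flips the sign
        # Intensifiers and committers change strength or commitment,
        # which this sketch does not model.
    return polarity


not_good = SentimentExpression("good", +1, [Modifier("not", ModifierKind.NEGATOR)])
if_good = SentimentExpression("good", +1, [Modifier("if", ModifierKind.NEUTRALIZER)])
very_good = SentimentExpression("good", +1, [Modifier("very", ModifierKind.INTENSIFIER)])
print(contextual_polarity(not_good), contextual_polarity(if_good), contextual_polarity(very_good))
```

A real system built on the corpus would also need to track intensity and commitment direction, which the annotation guidelines [10] describe in detail.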
References
Asher, N., Benamara, F., Mathieu, Y.Y.: Distilling opinion in discourse: a preliminary study. In: Coling 2008: Companion volume: Posters, pp. 7–10, Coling Organizing Committee, Manchester, UK (2008)
Bloom, K.: Sentiment analysis based on appraisal theory and functional local grammars. Ph.D. Dissertation, Illinois Institute of Technology (2011)
Breck, E., Cardie, C.: Playing the telephone game: determining the hierarchical structure of perspective and speech expressions. In: COLING (2004)
Breck, E., Choi, Y., Cardie, C.: Identifying expressions of opinion in context. In: IJCAI (2007)
Brown, G.I.: An error analysis of relation extraction in social media documents. In: Proceedings of the ACL 2011 Student Session. HLT-SS ’11, pp. 64–68. Association for Computational Linguistics, Stroudsburg, PA, USA (2011)
Choi, Y., Cardie, C.: Learning with compositional semantics as structural inference for subsentential sentiment analysis. In: EMNLP (2008)
Choi, Y., Kim, Y., Myaeng, S.-H.: Domain-specific sentiment analysis using contextual feature generation. In: TSA (2009)
Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20(1), 37–46 (1960)
Ding, X., Liu, B., Yu, P.S.: A holistic lexicon-based approach to opinion mining. In: WSDM (2008)
Eckert, M., Clark, L., Lind, H., Kessler, J., Nicolov, N.: Structural sentiment and entity annotation guidelines. J.D. Power and Associates Technical Report (2010)
Fahrni, A., Klenner, M.: Old wine or warm beer: target-specific sentiment analysis of adjectives. In: AISB (2008)
Ginsca, A.-L.: Fine-grained opinion mining as a relation classification problem. In: Jones A.V. (ed.) ICCSW. OASICS, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, vol. 28, pp. 56–61. Germany (2012)
Girju, R., Badulescu, A., Moldovan, D.: Automatic discovery of part-whole relations. Comput. Linguist. 32(1), 83–135 (2006)
Hu, M., Liu, B.: Mining and summarizing customer reviews. In: KDD (2004)
Jbara, A.A.: Using natural language processing to mine multiple perspectives from social media and scientific literature. Ph.D. Dissertation, The University of Michigan (2013)
Karttunen, L., Zaenen, A.: Veridicity. In: Annotating, extracting and reasoning about time and events (2005)
Kessler, W., Kuhn, J.: Detection of product comparisons - how far does an out-of-the-box semantic role labeling system take you? In: EMNLP, pp. 1892–1897. ACL (2013)
Kessler, J.S.: Polling the blogosphere: a Rule-Based approach to belief classification. In: ICWSM (2008)
Kessler, J.S., Nicolov, N.: Targeting sentiment expressions through supervised ranking of linguistic configurations. In: ICWSM (2009)
Kessler, J.S., Eckert, M., Clark, L., Nicolov, N.: The 2010 ICWSM JDPA sentiment corpus for the automotive domain. In: ICWSM-DWC (2010)
Kim, S.-M., Hovy, E.: Determining the sentiment of opinions. In: COLING (2004)
Kim, S.-M., Hovy, E.: Extracting opinions, opinion holders, and topics expressed in online news media text. In: ACL Workshop on sentiment and subjectivity in text (2006)
Krestel, R., Witte, R., Bergler, S.: Minding the source: automatic tagging of reported speech in newspaper articles. In: LREC (2008)
Moilanen, K., Pulman, S.: Multi-entity sentiment scoring. In: RANLP (2009)
Ng, V., Cardie, C.: Improving machine learning approaches to coreference resolution. In: ACL (2002)
NIST Speech Group: The ACE 2006 evaluation plan: evaluation of the detection and recognition of ACE entities, values, temporal expressions, relations, and events (2006)
Nowson, S.: Scary movies good, scary flights bad: topic driven feature selection for classification of sentiment. In: TSA (2009)
Ogren, P.V.: Knowtator: a protégé plug-in for annotated corpus construction. In: NAACL-HLT (2006)
Polanyi, L., Zaenen, A.: Contextual valence shifters. In: Computing attitude and affect in text: theory and applications (2006)
Pustejovsky, J., Hanks, P., Sauri, R., See, A., Gaizauskas, R., Setzer, A., Radev, D., Sundheim, B., Day, D., Ferro, L., Lazo, M.: The TimeBank corpus. In: Corpus Linguistics (2003)
Rubin, V.L.: Stating with certainty or stating with doubt: intercoder reliability results for manual annotation of epistemically modalized statements. In: NAACL-HLT (2007)
Ruppenhofer, J., Somasundaran, S., Wiebe, J.: Finding the sources and targets of subjective expressions. In: LREC (2008)
Shaikh, M.A.M., Prendinger, H., Ishizuka, M.: Sentiment assessment of text by analyzing linguistic features and contextual valence assignment. Appl. Artif. Intell. 22(6), 558–601 (2008)
Su, F., Markert, K.: From words to senses: a case study of subjectivity recognition. In: COLING (2008)
Tsur, O., Davidov, D., Rappoport, A.: ICWSM - a great catchy name: semi-supervised recognition of sarcastic sentences in product reviews. In: ICWSM (2010)
Vaswani, V.: Predicting sentiment-mention associations in product reviews. Ph.D. Dissertation, Kansas State University (2012)
Wiebe, J., Mihalcea, R.: Word sense and subjectivity. In: ACL (2006)
Wiebe, J., Wilson, T., Cardie, C.: Annotating expressions of opinions and emotions in language. In: LREC (2005)
Wiegand, M., Klakow, D.: Topic-related polarity classification of blog sentences. In: EPIA (2009)
Wilson, T., Wiebe, J.: Annotating opinions in the world press. In: SIGdial (2003)
Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity in phrase-level sentiment analysis. In: HLT-EMNLP (2005)
Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis. Comput. Linguist. 35(3), 399–433 (2009)
Wilson, T.A.: Fine-grained subjectivity and sentiment analysis: recognizing the intensity, polarity, and attitudes of private states. Ph.D. Dissertation, University of Pittsburgh (2008)
Winston, M.E., Chaffin, R., Herrmann, D.: A taxonomy of part-whole relations. Cognit. Sci. 11(4), 417–444 (1987)
Yu, N., Kübler, S.: Filling the gap: semi-supervised learning for opinion detection across domains. In: Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pp. 200–209. Association for Computational Linguistics (2011)
Acknowledgements
We would like to thank Prof. Martha Palmer, Prof. James Martin, and Prof. Michael Mozer at the University of Colorado, Prof. Michael Gasser at Indiana University, and Dr. William Headden at J.D. Power and Associates for their helpful discussions. Dr. Miriam Eckert and Lyndsie Clark assisted with an earlier iteration of the corpus description [20].
Copyright information
© 2017 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Kessler, J.S., Nicolov, N. (2017). The JDPA Sentiment Corpus for the Automotive Domain. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_30
DOI: https://doi.org/10.1007/978-94-024-0881-2_30
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-024-0879-9
Online ISBN: 978-94-024-0881-2
eBook Packages: Social Sciences (R0)