The JDPA Sentiment Corpus for the Automotive Domain

Kessler, Jason S.; Nicolov, Nicolas

doi:10.1007/978-94-024-0881-2_30

Abstract

This chapter presents a rich annotation scheme covering mentions, co-reference, meronymy, sentiment expressions, and modifiers of sentiment expressions (including neutralizers, negators, and intensifiers), and describes a large corpus annotated with this scheme. We define the annotation types, provide examples, and report statistics on their occurrence and on inter-annotator agreement. This resource is the largest sentiment-topical corpus to date and is publicly available. It helps quantify sentiment phenomena, supports the construction of advanced sentiment systems, and enables direct comparison of different algorithms.

Work was conducted while both authors were at J.D. Power and Associates Web Intelligence, McGraw Hill.

Notes

1. We draw a distinction between the immediate target of a sentiment expression and a document-level topic. Other work, such as [27], has addressed the problem of developing topic-dependent feature sets for supervised classification of document-level polarity.
2. Called “negatives” in [29].
3. The TimeML corpus [30] has explicit annotations for counter-factive events and treats negation as a property of an event. We believe that both act the same way with respect to contextual polarity.
4. Reference [31] presents a corpus containing “certainty markers”, expressions that indicate commitment to a sentence or clause and its level of certainty, on a scale ranging from uncertain to absolute certainty. Our committers are judged on a binary scale: do they raise or lower the author’s commitment to a sentiment expression or modification?
5. The problem of determining whether an event is asserted as true, asserted as false, or left with an unknown truth value is called veridicity [16]. Reference [18] describes a rule-based system, tailored to the blogosphere, for recognizing the veridicity of some clauses; its released lexicon includes “neutral veridicality elements”, which neutralize their argument clauses.
6. Discussion of descriptors is omitted due to space constraints. See the annotation guidelines [10] for details about this annotation.

References

1. Asher, N., Benamara, F., Mathieu, Y.Y.: Distilling opinion in discourse: a preliminary study. In: Coling 2008: Companion volume: Posters, pp. 7–10. Coling Organizing Committee, Manchester, UK (2008)
2. Bloom, K.: Sentiment analysis based on appraisal theory and functional local grammars. Ph.D. Dissertation, Illinois Institute of Technology (2011)
3. Breck, E., Cardie, C.: Playing the telephone game: determining the hierarchical structure of perspective and speech expressions. In: COLING (2004)
4. Breck, E., Choi, Y., Cardie, C.: Identifying expressions of opinion in context. In: IJCAI (2007)
5. Brown, G.I.: An error analysis of relation extraction in social media documents. In: Proceedings of the ACL 2011 Student Session, HLT-SS ’11, pp. 64–68. Association for Computational Linguistics, Stroudsburg, PA, USA (2011)
6. Choi, Y., Cardie, C.: Learning with compositional semantics as structural inference for subsentential sentiment analysis. In: EMNLP (2008)
7. Choi, Y., Kim, Y., Myaeng, S.-H.: Domain-specific sentiment analysis using contextual feature generation. In: TSA (2009)
8. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20(1), 37–46 (1960)
9. Ding, X., Liu, B., Yu, P.S.: A holistic lexicon-based approach to opinion mining. In: WSDM (2008)
10. Eckert, M., Clark, L., Lind, H., Kessler, J., Nicolov, N.: Structural sentiment and entity annotation guidelines. J.D. Power and Associates Technical Report (2010)
11. Fahrni, A., Klenner, M.: Old wine or warm beer: target-specific sentiment analysis of adjectives. In: AISB (2008)
12. Ginsca, A.-L.: Fine-grained opinion mining as a relation classification problem. In: Jones, A.V. (ed.) ICCSW. OASICS, vol. 28, pp. 56–61. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany (2012)
13. Girju, R., Badulescu, A., Moldovan, D.: Automatic discovery of part-whole relations. Comput. Linguist. 32(1), 83–135 (2006)
14. Hu, M., Liu, B.: Mining and summarizing customer reviews. In: KDD (2004)
15. Jbara, A.A.: Using natural language processing to mine multiple perspectives from social media and scientific literature. Ph.D. Dissertation, The University of Michigan (2013)
16. Karttunen, L., Zaenen, A.: Veridicity. In: Annotating, extracting and reasoning about time and events (2005)
17. Kessler, W., Kuhn, J.: Detection of product comparisons - how far does an out-of-the-box semantic role labeling system take you? In: EMNLP, pp. 1892–1897. ACL (2013)
18. Kessler, J.S.: Polling the blogosphere: a rule-based approach to belief classification. In: ICWSM (2008)
19. Kessler, J.S., Nicolov, N.: Targeting sentiment expressions through supervised ranking of linguistic configurations. In: ICWSM (2009)
20. Kessler, J.S., Eckert, M., Clark, L., Nicolov, N.: The 2010 ICWSM JDPA sentiment corpus for the automotive domain. In: ICWSM-DWC (2010)
21. Kim, S.-M., Hovy, E.: Determining the sentiment of opinions. In: COLING (2004)
22. Kim, S.-M., Hovy, E.: Extracting opinions, opinion holders, and topics expressed in online news media text. In: ACL Workshop on Sentiment and Subjectivity in Text (2006)
23. Krestel, R., Witte, R., Bergler, S.: Minding the source: automatic tagging of reported speech in newspaper articles. In: LREC (2008)
24. Moilanen, K., Pulman, S.: Multi-entity sentiment scoring. In: RANLP (2009)
25. Ng, V., Cardie, C.: Improving machine learning approaches to coreference resolution. In: ACL (2002)
26. NIST Speech Group: The ACE 2006 evaluation plan: evaluation of the detection and recognition of ACE entities, values, temporal expressions, relations, and events (2006)
27. Nowson, S.: Scary movies good, scary flights bad: topic driven feature selection for classification of sentiment. In: TSA (2009)
28. Ogren, P.V.: Knowtator: a Protégé plug-in for annotated corpus construction. In: NAACL-HLT (2006)
29. Polanyi, L., Zaenen, A.: Contextual valence shifters. In: Computing attitude and affect in text: theory and applications (2006)
30. Pustejovsky, J., Hanks, P., Sauri, R., See, A., Gaizauskas, R., Setzer, A., Radev, D., Sundheim, B., Day, D., Ferro, L., Lazo, M.: The TimeBank corpus. In: Corpus Linguistics (2003)
31. Rubin, V.L.: Stating with certainty or stating with doubt: intercoder reliability results for manual annotation of epistemically modalized statements. In: NAACL-HLT (2007)
32. Ruppenhofer, J., Somasundaran, S., Wiebe, J.: Finding the sources and targets of subjective expressions. In: LREC (2008)
33. Shaikh, M.A.M., Prendinger, H., Ishizuka, M.: Sentiment assessment of text by analyzing linguistic features and contextual valence assignment. Appl. Artif. Intell. 22(6), 558–601 (2008)
34. Su, F., Markert, K.: From words to senses: a case study of subjectivity recognition. In: COLING (2008)
35. Tsur, O., Davidov, D., Rappoport, A.: ICWSM - a great catchy name: semi-supervised recognition of sarcastic sentences in product reviews. In: ICWSM (2010)
36. Vaswani, V.: Predicting sentiment-mention associations in product reviews. Ph.D. Dissertation, Kansas State University (2012)
37. Wiebe, J., Mihalcea, R.: Word sense and subjectivity. In: ACL (2006)
38. Wiebe, J., Wilson, T., Cardie, C.: Annotating expressions of opinions and emotions in language. In: LREC (2005)
39. Wiegand, M., Klakow, D.: Topic-related polarity classification of blog sentences. In: EPIA (2009)
40. Wilson, T., Wiebe, J.: Annotating opinions in the world press. In: SIGdial (2003)
41. Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity in phrase-level sentiment analysis. In: HLT-EMNLP (2005)
42. Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis. Comput. Linguist. 35(3), 399–433 (2009)
43. Wilson, T.A.: Fine-grained subjectivity and sentiment analysis: recognizing the intensity, polarity, and attitudes of private states. Ph.D. Dissertation, University of Pittsburgh (2008)
44. Winston, M.E., Chaffin, R., Herrmann, D.: A taxonomy of part-whole relations. Cognit. Sci. 11(4), 417–444 (1987)
45. Yu, N., Kübler, S.: Filling the gap: semi-supervised learning for opinion detection across domains. In: Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pp. 200–209. Association for Computational Linguistics (2011)

Acknowledgements

We would like to thank Prof. Martha Palmer, Prof. James Martin, and Prof. Michael Mozer at the University of Colorado, Prof. Michael Gasser at Indiana University, and Dr. William Headden at J.D. Power and Associates for their helpful discussions. Dr. Miriam Eckert and Lyndsie Clark assisted with an earlier iteration of the corpus description [20].

Author information

Authors and Affiliations

CDK Global, 605 Fifth Ave S, Ste 800, Seattle, WA, 98104, USA
Jason S. Kessler & Nicolas Nicolov

Corresponding author

Correspondence to Jason S. Kessler.

Editor information

Editors and Affiliations

Department of Computer Science, Vassar College, Poughkeepsie, New York, USA
Nancy Ide
Department of Computer Science, Volen Center for Complex Systems, Brandeis University, Waltham, Massachusetts, USA
James Pustejovsky

About this chapter

Cite this chapter

Kessler, J.S., Nicolov, N. (2017). The JDPA Sentiment Corpus for the Automotive Domain. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_30

DOI: https://doi.org/10.1007/978-94-024-0881-2_30
Published: 17 June 2017
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-024-0879-9
Online ISBN: 978-94-024-0881-2
eBook Packages: Social Sciences, Social Sciences (R0)
