Abstract
Sentiment analysis has become a hot research topic in natural language processing (NLP) because it is highly valuable for many real applications. One basic task in sentiment analysis is sentiment classification, which aims to predict the sentiment orientation (positive or negative) of a document. Current approaches to this problem are mainly based on supervised machine learning. The main drawback of such approaches is their need for large amounts of labeled data, so reducing the annotation cost has become an important issue in sentiment classification. In this study, we propose a novel active learning approach that selects both "informative" words and "informative" documents for annotation. Experimental results show that our approach clearly outperforms both random selection and uncertainty sampling on documents.
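For readers unfamiliar with the uncertainty sampling baseline named above, the following is a minimal sketch of that baseline only, not of the approach proposed in this paper. It assumes a binary classifier that already supplies a posterior P(positive | document) for each unlabeled document; the function names, document ids, and probabilities are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def uncertainty(p_positive: float) -> float:
    """Uncertainty of a binary prediction: 1.0 at p = 0.5, 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_positive - 1.0)


def select_most_uncertain(posteriors: Dict[str, float], k: int) -> List[str]:
    """Return the ids of the k unlabeled documents the classifier is least sure about.

    Ties are broken by document id so that the result is deterministic.
    """
    ranked = sorted(
        posteriors,
        key=lambda doc_id: (-uncertainty(posteriors[doc_id]), doc_id),
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical posteriors P(positive | document) from some current classifier.
    pool = {"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.40}
    # The selected documents would be sent to a human annotator for labeling.
    print(select_most_uncertain(pool, 2))
```

In a pool-based active learning loop, the selected documents would be labeled, added to the training set, and the classifier retrained before the next round of selection. Random selection, the other baseline mentioned, would simply draw k documents from the pool without consulting the classifier.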
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ju, S., Li, S. (2013). Active Learning on Sentiment Classification by Selecting Both Words and Documents. In: Ji, D., Xiao, G. (eds) Chinese Lexical Semantics. CLSW 2012. Lecture Notes in Computer Science, vol 7717. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36337-5_6
DOI: https://doi.org/10.1007/978-3-642-36337-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36336-8
Online ISBN: 978-3-642-36337-5
eBook Packages: Computer Science, Computer Science (R0)