Abstract
Chinese sentences in news articles are usually very long, which creates obstacles for subsequent opinion mining steps. Sentence compression is the task of producing a brief summary at the sentence level. Conventional compression methods do not distinguish opinionated information from factual information in a sentence. In this paper, we propose a weakly supervised Chinese sentence compression method that aims to eliminate the negligible factual parts and preserve the core opinionated parts of the sentence. No parallel corpus is needed for compression. Experiments involving both automatic evaluations and human subjective evaluations show that the proposed method is effective at finding the desired parts of long Chinese sentences.
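To make the goal concrete, the following is a toy sketch of opinion-preserving compression, not the method proposed in the paper. It assumes that a sentence can be split into clauses at commas and that a clause is opinionated when it contains a word from a small lexicon. The lexicon `OPINION_WORDS`, the function `compress`, and the example sentence are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon (assumption): think, support, oppose,
# criticize, worry, hope. The abstract does not list the paper's resources.
OPINION_WORDS = {"认为", "支持", "反对", "批评", "担心", "希望"}

# Clause delimiters: full-width and ASCII commas and semicolons.
CLAUSE_DELIMS = re.compile(r"[，；,;]")

# Sentence-final punctuation that is preserved on the output.
SENTENCE_END = "。！？.!?"


def compress(sentence: str, lexicon=OPINION_WORDS) -> str:
    """Keep only the clauses that contain at least one lexicon word."""
    body = sentence.rstrip(SENTENCE_END)
    end = sentence[len(body):]
    clauses = [c.strip() for c in CLAUSE_DELIMS.split(body) if c.strip()]
    kept = [c for c in clauses if any(word in c for word in lexicon)]
    if not kept:
        # No clause matched the lexicon: leave the sentence unchanged.
        return sentence
    return "，".join(kept) + end


# "At the press conference held in Beijing yesterday afternoon, the
# spokesperson criticized the plan, and hoped all parties would show restraint."
sentence = "昨天下午在北京举行的新闻发布会上，发言人批评了这项计划，并希望各方保持克制。"
compressed = compress(sentence)
print(compressed)  # 发言人批评了这项计划，并希望各方保持克制。
```

The factual clause giving the time and place is dropped, while the two clauses carrying the opinion are kept. A real system would need far more than clause-level lexicon matching, for example syntactic analysis to keep the result grammatical.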
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feng, S., Wang, D., Yu, G., Li, B., Wong, KF. (2010). A Chinese Sentence Compression Method for Opinion Mining. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_31
DOI: https://doi.org/10.1007/978-3-642-17187-1_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17186-4
Online ISBN: 978-3-642-17187-1
eBook Packages: Computer Science (R0)