Combining Syntax and Semantics for Automatic Extractive Single-Document Summarization

Barrera, Araly; Verma, Rakesh

doi:10.1007/978-3-642-28601-8_31

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7182)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

The goal of automated summarization is to tackle the “information overload” problem by extracting and perhaps compressing the most important content of a document. Because single-document summarization has had difficulty beating a standard baseline, especially for news articles, most efforts are currently focused on multi-document summarization. The goal of this study is to reconsider the importance of single-document summarization by introducing a new approach and its implementation. This approach combines syntactic, semantic, and statistical methodologies, and reflects psychological findings that pinpoint specific selection patterns humans follow as they construct summaries. Our system achieves successful summary evaluation results and outperforms the baseline on two separate datasets: the Document Understanding Conference (DUC) 2002 dataset and a set of scientific magazine articles. These results have implications for extractive and abstractive single-document summarization and could also be leveraged in multi-document summarization.

Author information

Authors and Affiliations

Computer Science Department, University of Houston, Houston, TX, USA
Araly Barrera & Rakesh Verma

Editor information

Editors and Affiliations

Center for Computing Research (CIC), National Polytechnic Institute (IPN), Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Barrera, A., Verma, R. (2012). Combining Syntax and Semantics for Automatic Extractive Single-Document Summarization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_31

DOI: https://doi.org/10.1007/978-3-642-28601-8_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28600-1
Online ISBN: 978-3-642-28601-8
eBook Packages: Computer Science, Computer Science (R0)
