Abstract
Multi-document summarization extracts the most relevant sentences from a set of documents, allowing the user to grasp their content more quickly. This paper addresses the generation of extractive summaries from multiple documents as a binary optimization problem and proposes a method called MA-MultiSumm, based on the CHC evolutionary algorithm and greedy search, whose objective function optimizes a linear combination of coverage and redundancy factors. MA-MultiSumm was compared with other state-of-the-art methods using ROUGE measures. The results showed that MA-MultiSumm outperforms all the compared methods on the DUC2005 dataset, and on DUC2006 its results are very close to those of the best method. Furthermore, in a unified ranking MA-MultiSumm was surpassed only by the DESAMC+DocSum method, which requires as many iterations of the evolutionary process as MA-MultiSumm. The experimental results show that the optimization-based approach to multi-document summarization is a promising research direction.
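To make the binary formulation concrete, the following is a minimal sketch of an objective that linearly combines coverage and redundancy, together with a greedy bit-flip local search. It is not the MA-MultiSumm implementation: the abstract does not define the two factors, so the sketch assumes coverage is the mean cosine similarity of the selected sentences to the collection centroid and redundancy is the mean pairwise cosine similarity among the selected sentences. The weight `lam`, the function names, and the length constraint `max_len` are hypothetical, and the CHC evolutionary loop itself is omitted.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def objective(x, vectors, lam=0.5):
    """Score a binary selection x (1 = sentence is in the summary).

    Assumed form: lam * coverage - (1 - lam) * redundancy.
    """
    chosen = [v for bit, v in zip(x, vectors) if bit]
    if not chosen:
        return 0.0
    dim = len(vectors[0])
    centroid = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
    # Assumed coverage: how close the chosen sentences are to the centroid.
    coverage = sum(cosine(v, centroid) for v in chosen) / len(chosen)
    # Assumed redundancy: how similar the chosen sentences are to each other.
    pairs = list(combinations(chosen, 2))
    redundancy = sum(cosine(u, v) for u, v in pairs) / len(pairs) if pairs else 0.0
    return lam * coverage - (1.0 - lam) * redundancy


def greedy_improve(x, vectors, lengths, max_len, lam=0.5):
    """Best-improvement local search: flip one bit at a time while the score rises."""
    x = list(x)
    best = objective(x, vectors, lam)
    improved = True
    while improved:
        improved = False
        best_move = None
        for i in range(len(x)):
            y = x[:]
            y[i] = 1 - y[i]
            # Skip neighbours that exceed the summary length budget.
            if sum(n for bit, n in zip(y, lengths) if bit) > max_len:
                continue
            score = objective(y, vectors, lam)
            if score > best + 1e-12:
                best, best_move = score, i
        if best_move is not None:
            x[best_move] = 1 - x[best_move]
            improved = True
    return x, best


# Toy data: sentences 0 and 1 are identical, sentence 2 is different.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
lengths = [1, 1, 1]
summary, score = greedy_improve([0, 0, 0], vectors, lengths, max_len=2)
```

Under these assumptions, a selection containing the two identical sentences scores lower than one pairing a sentence with the dissimilar one, which is the behaviour the redundancy term is meant to enforce. In a memetic scheme, a local search of this kind would typically be applied to individuals produced by the evolutionary operators.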
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Mendoza, M., Cobos, C., León, E., Lozano, M., Rodríguez, F., Herrera-Viedma, E. (2014). A New Memetic Algorithm for Multi-document Summarization Based on CHC Algorithm and Greedy Search. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_14
DOI: https://doi.org/10.1007/978-3-319-13647-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science, Computer Science (R0)