Abstract
This chapter presents a method based on natural language processing (NLP) for summarizing a single Arabic document. The method is extractive: it selects the most valuable information already present in the document. Because working with Arabic text is considered a challenging task, the chapter applies several NLP techniques to produce an accurate result. The proposed method consists of three phases. The first is a pre-processing phase that unifies synonymous terms, applies stemming, and removes punctuation marks and text decoration. The method then builds feature vectors, scores the features, and selects the clauses with the highest scores, marking them as important clauses. The results of the suggested method are compared with those of traditional methods. To support a sound comparison and an effective evaluation, two human experts manually summarized all the datasets. The evaluation uses several performance measures: accuracy, precision, recall, F-measure, and ROUGE. The experimental results show that the suggested method performs competitively with the human experts, both in summarization ratio and in the accuracy of the produced document.
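The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names (`normalize`, `split_clauses`, `score_clauses`, `summarize`, `precision_recall_f1`), the average-term-frequency score, and the `ratio` parameter are all assumptions made for illustration, and synonym unification and stemming are left out because the abstract does not specify how they are done.

```python
import re
from collections import Counter

# Assumption: "text decoration" means Arabic diacritics (tashkeel) and tatweel.
_DECORATION = re.compile(r"[\u064B-\u0652\u0640]")
# Assumption: anything that is neither a word character nor whitespace is punctuation.
_PUNCT = re.compile(r"[^\w\s]")


def normalize(clause):
    """Remove decoration and punctuation, then split the clause into tokens."""
    clause = _DECORATION.sub("", clause)
    clause = _PUNCT.sub(" ", clause)
    return clause.split()


def split_clauses(text):
    """Split on Latin and Arabic clause-ending marks (. ! ? and U+061F, U+061B)."""
    parts = re.split(r"[.!?\u061F\u061B]+", text)
    return [p.strip() for p in parts if p.strip()]


def score_clauses(clauses):
    """Score each clause by the average document-level frequency of its tokens.

    This single feature is a stand-in for the chapter's feature vectors.
    """
    tokenized = [normalize(c) for c in clauses]
    freq = Counter(tok for toks in tokenized for tok in toks)
    return [
        sum(freq[t] for t in toks) / len(toks) if toks else 0.0
        for toks in tokenized
    ]


def summarize(text, ratio=0.5):
    """Return the highest-scoring clauses, kept in their original order."""
    clauses = split_clauses(text)
    if not clauses:
        return []
    scores = score_clauses(clauses)
    keep = max(1, int(len(clauses) * ratio))
    top = sorted(range(len(clauses)), key=lambda i: scores[i], reverse=True)[:keep]
    return [clauses[i] for i in sorted(top)]


def precision_recall_f1(selected, reference):
    """Compare selected clause ids against a human-chosen reference set."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Calling `summarize(document_text, ratio=0.3)` would keep roughly the top 30% of clauses. `precision_recall_f1` covers only the clause-overlap measures; ROUGE compares n-grams between the produced and reference summaries and would need a separate implementation.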
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bialy, A.A., Gaheen, M.A., ElEraky, R.M., ElGamal, A.F., Ewees, A.A. (2020). Single Arabic Document Summarization Using Natural Language Processing Technique. In: Abd Elaziz, M., Al-qaness, M., Ewees, A., Dahou, A. (eds) Recent Advances in NLP: The Case of Arabic Language. Studies in Computational Intelligence, vol 874. Springer, Cham. https://doi.org/10.1007/978-3-030-34614-0_2
DOI: https://doi.org/10.1007/978-3-030-34614-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34613-3
Online ISBN: 978-3-030-34614-0
eBook Packages: Intelligent Technologies and Robotics (R0)