Abstract
Today, given the increasing volume of information and the difficulty of using them for specific applications such as email, websites, news, etc., the use of automated information summarization algorithms has become more popular than traditional algorithms. Taking advantage of computer algorithms, these algorithms produce a summary of information while retaining its original meaning. However, given its semantic and structural properties as well as the variable comparative parameters of information, the summarization process is considered as an NP-hard problem. Therefore, to solve these problems it is better to use meta-heuristic algorithms, which are generally inspired by the behavior of nature. These meta-heuristic algorithms help to better solve the hard problems through producing optimum solutions. In this paper, we propose an optimization algorithm named multi-agent meta-heuristic optimization algorithm (MAMHOA) for extractive text summarization. MAMHOA is a combination of biogeography-based optimization (BBO) algorithm and multi-agent systems concepts to generate an optimum summary. Several computational tests are used to evaluate the effectiveness and efficiency of the proposed algorithm, which is compared to other algorithms provided in the literature. MAMHOA and other algorithms are tested on DUC2002 datasets and attained solutions are analyzed using ROUGE metrics. From the results obtained, it can be seen that the proposed algorithm is more effective and efficient than those mentioned in the literature i.e. Baseline and state-of-the-art methods for different ROUGE metrics.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
1 Introduction
The goal of optimization is to get the best or near the best acceptable solution, given the constraints and needs of the problem. For a problem, there may be several different solutions that are defined to be compared and then an optimal solution is selected. Research has shown that meta-heuristic algorithms and multi-agent systems can be used as a solution for improving the optimization process. Meta-heuristic algorithms are algorithms that are inspired by nature, physics, and humans and are used to solve many optimization problems. Optimization algorithms are usually used in combination with other algorithms to reach the optimal solution or to exit from the local optimal solution. Multi-agent systems are among the new approaches that have been proposed to achieve the above-mentioned capabilities. Agent-based systems in the field of distributed artificial intelligence have been studied for several years and have been used in various fields of computer science. Agents are able to perceive and react against the environment, having an extensive range of meanings depending on the problem to be solved. Generally, an agent can be characterized by the following four properties: (1) living and acting in an environment; (2) having the ability to sense its local surroundings; (3) acting based on a specific purpose; and (4) having some reactive behaviors (Li and Jing 2016). In a summarization problem, the goal is to create the best or near the best optimum summary of one document or more documents, which consist of three features obtained by extracting the most important contents. These features are as follows (Oufaida et al. 2014):
-
Coverage All the information of important topics in the documents should be covered by the summary.
-
Non-redundancy It indicates the novelty in the summary.
-
Relevancy All the information of contents that are relevant to a user should be contained in the summary.
In general, summarization algorithms can be categorized into two classes, namely, extractive and abstractive summarization (Ferreira et al. 2013). Extractive summarization has to do with concatenating multiple relevant sentences which are chosen without any modification, i.e. exactly as they are used in the original document. Moreover, creating abstractive summaries is a difficult task since it requires paraphrasing sections of the source document. Therefore, it needs natural language generation tools. Furthermore, abstractive algorithms may reuse clauses or phrases from original documents (Nenkova et al. 2011). In this paper, we proposed an algorithm named MAMHOA algorithm, which is a combination of a biogeography meta-heuristic algorithm and multi-agent systems aimed at improving extractive text summarization. In MAMHOA, a set of agents i.e. habitats, is considered which are competing or cooperating with their neighbors to find the best conditions and best habitats. The agents collaborate in a multi-agent system to achieve a specific, shared goal, and each agent performs its own task. This collaboration can be done through communication and exchange of information with neighbors of each agent. The performance of the proposed algorithm is evaluated for single-document extractive text summarization on DUC 2002Footnote 1 datasets. As for evaluation, the proposed algorithm is compared with some of the Baseline and state-of-the-art methods of extractive text summarization.
The rest of the paper is organized as follows: In Sect. 2, the literature review is discussed. In Sect. 3, the original Biogeography-based optimization is explained. In Sect. 4, the proposed algorithm, i.e. MAMHOA is introduced and assessed on the single-document extractive text summarization. In Sect. 5, the results of experiments are presented and discussed, and finally in Sect. 6 the paper is concluded.
2 Literature review
Our work generates a new meta-heuristic optimization algorithm for single-document extractive text summarization based on multi-agent concepts. In the literature, we will briefly review some of the methods in field of text summarization. As a first algorithm, Luhn (1958) is used for text summarization based on statistical features such as the frequency of words and the relative positioning of each sentence in the document. The authors in (Mihalcea and Tarau 2004) introduced a graph-based technique called TextRank. This algorithm represents the document in the form of a graph of sentences, and the edges between two sentences are connected based on the similarity between them. LexRank (Radev and Erkan 2004) applied the notion of eigenvector centrality to the graph representation of sentences, with the aim of estimating the importance of each sentence in the document. Moreover, Steinberger and Jezek (2004) put forth a topic-based algorithm named Latent Semantic Analysis (LSA), which seeks to spot salient sentences in the document through applying Singular Value Decomposition (SVD) over document matrix D of size m× n, consisting of m sentences and n number of terms. Nenkova and Vanderwende (2005) developed a greedy search approximation algorithm called SumBasic that makes use of a frequency-based sentence selection component with a component to re-weight the word probabilities in order to minimize redundancy. Amancio and Tohalino (2018) proposed an adapted model, which includes sentences and edges, which are created given the number of shared words between sentences. The proposed algorithm distinguishes between edges linking sentences in different documents (interlayer) and those connecting sentences from the same document (intra-layer). Rautray and Balabantaray (2018) put forth a new Cuckoo search–based multi-document summarizer (MDSCSA) to deal with the problem of multi-document summarization. They compare the new MDSCSA with two other nature-based summarization techniques, including Particle swarm optimization based summarization (PSOS) and Cat swarm optimization based summarization (CSOS). Haghighi and Vanderwende (2009) developed an algorithm for text summarization, which generates the output summary of the document because of the criteria of Kullback-Lieber (KL) divergence. As a phrase-based summarization technique, Integer Linear Programming (Woodsend and Lapata 2010) encodes the grammaticality constraints across phrase dependencies using integer linear programming. Wan (2010) proposed URANK, which is a unified rank methodology based on graph model. It is able to handle single-document and multi-document summarization simultaneously. Asgari and Masoumi (2013) provided an algorithm to improve the performance of text summarization using bacterial foraging optimization algorithm. Egraph (Parveen et al. 2015) is considered as an entity graph-based system, which makes use of a bipartite graph of sentence and entity nodes for document representation. They developed a system called “Egraph + Coh” which includes a coherence measure calculated by using the averaged-out degree of an unweighted projection graph. Tgraph (Parveen et al. 2015) is an unsupervised graph-based text summarization algorithm that applies Latent Dirichlet Allocation (LDA) for topic modeling. “Tgraph + Coh” takes account of coherence calculated using weighted projection graph for generating a summary. Cheng and Lapata (2016) developed NN–SE, which is a neural summarization model for sentence extraction. SummaRuNNer (Nallapati et al. 2016) is a recent extractive text summarization algorithm, which is based on RNNs. Mirshojaee et al. (2017) used an original biogeography-based optimization algorithm for extractive text summarization. The proposed algorithm is inspired by the natural migration of species between habitats.
CoRank (Fang et al. 2017) is a graph-based unsupervised extractive text summarization approach, which involves combining word-sentence relationship with the graph-based ranking model. Now we discuss some approaches, which use BBO meta-heuristic algorithm in field of optimization. Zhang et al. (2018) presented a novel hybrid algorithm based on BBO and Grey wolf optimizer (GWO), named as HBBOG. In the proposed algorithm, both algorithms (BBO and GWO) are improved. Yang and Li (2017) introduced an improved biogeography-based optimization algorithm that uses the nonlinear migration operator. They have used this proposed algorithm in mobile robot’s path planning. Chen and Tianfield (2016) developed a covariance matrix-based migration (CMM) to reduce BBO’s dependence on the coordinate system, resulting in the enhancement of BBO’s rotational invariance. Khishe et al. (2017) proposed a novel exponential-logarithmic migration operator for the original BBO, named as ELBBO. Paraskevopoulos et al. (2017) proposed a modified BBO named as real-coded biogeography-based optimization (RCBBO) that combined fuzzy decision making for the cognitive radio engine design used in the internet of things (IoT). To improve the population diversity, the authors employed Gaussian, Cauchy and Levy mutations as the mutation operator. Feng et al. (2017) proposed a modified BBO named as PRBBO that was improved for solving global optimization problems. In the modified algorithm, triple combination includes, migration operator combined with random ring topology, a modified mutation operator, and a self-adaptive Pow-ell’s algorithm. In the proposed algorithm, local ring topology is used instead of the global topology to increase population diversity. Lohokare et al. (2013) presented a mimetic BBO named as ABBOMDE, in which the performance of BBO is accelerated with the help of a modified mutation and clear duplicate operators.
3 Biogeography-based optimization algorithm
In this section, we introduce the original BBO algorithm, which is used in the proposed algorithm. BBO is a new intelligence-based meta-heuristic algorithm inspired by nature based on the concept of animal migration to find the best habitat (Simon 2008; Harish 2015). In BBO, each habitat represents a candidate solution of the problem. Species migrate from one habitat to another habitat depending on a suitability index called habitat suitability index (HSI). HSI actually represents the individual’s fitness. Parameters such as rainfall, temperature, region, humidity, etc., affect the excellent characteristics of biological habitats. In BBO algorithm, from Simon’s point of view, these characteristics are called suitability index variables (SIVs). Simply n-dimensional habitat, which is a candidate solution of the problem, is formed by n SIVs whose fitness is denoted by HSI. In BBO (Fig. 1 ), each habitat has an immigration rate (λ) which means that it has a desire to accept poor habitats, and emigration rate (μ) which means that there is a strong tendency to migrate to rich habitat. In BBO, the solution is improved with a focus on the immigration and emigration of solution features within habitats. High HSI habitats share their good features with low HSI habitats, with low HSI habitats accepting the novel features of high HSI habitats. In accordance with the principle of the BBO, it is highly probable that good solutions share their SIVs with other solutions and poor solutions accept SIVs from other solutions. There is a decrease in emigration rates from high HSI habitats to low HSI ones so that the highest HSI habitat has the maximum emigration rate while there is an increase in immigration rate from high HSI habitats to low HSI ones, with the highest HSI habitat having the lowest immigration rate.
Immigration rate λ and emigration rate μ are calculated according to the formulas (1) and (2).
The parameters of \( \lambda_{i} \) and \( \mu_{i} \) are the immigration rate and emigration rate for ith habitat, respectively. The parameters of I and E are the maximum immigration rate and maximum emigration rate, respectively, and n denotes the population size. The \( k_{i} \) stands for the fitness rank of ith habitat after fitness of ith habitat is sorted, so that worst solution has \( k_{i} \) of 1 and the best solution has \( k_{i} \) of n. Migration and mutation are two crucial operators in BBO. The migration operator is responsible for generating new solution at each iteration and is similar to the crossover operator in the evolutionary algorithm. The mutation operator is done randomly on habitats and is responsible for maintaining the diversity of habitats and preventing the trapping of the algorithm in the local optimal point. According to formulas (3) and (4), the parameter of \( P_{s} \) is the probability which denotes s species in the habitat ranging from t to \( (t + \Delta t) \) where \( \lambda_{s} \) is immigration rate and \( \mu_{s} \) is emigration rate when there are s species in the habitat.
At time \( t + \Delta t \), one of the following conditions must hold for s species in the habitat.
-
1.
If there were s species in the habitat at time t, then there is no immigration and no emigration of species within time t and \( t + \Delta t \).
-
2.
If there were (s – 1) species in the habitat at time t, then there is one species immigrated within time t and \( t + \Delta t .\)
-
3.
If there were (s + 1) species in the habitat at time t, there is one species emigrated within time t and \( t + \Delta t \).
To ignore the probability of more than one immigration or emigration, we set \( \Delta t \) very small so that \( \Delta t \to 0 .\)
Solutions with very high HIS and very low HSI are equally improbable. Medium HSI solutions are relatively probable. The mutation rate is expressed by formula (5) where mmax denotes a user-defined parameter, \( p_{i} \) denotes priori existence probability, and pmax = max {pi, i = 1, 2… ps}.
In the case of the original mutation operator, a random value, created in the whole solution space, is used to probabilistically substitute SIV in each solution. Mutation represents the sudden changes in the habitat. This operator is tasked with maintaining the diversity in the population during BBO process. Mutation randomly makes minor changes to habitat SIVs based on the habitat’s a priori probability. The algorithm in Fig. 2, indicates the pseudo-code of BBO algorithm with mutation and migration operators, where \( H_{i} \) and \( H_{j} \) represent the ith habitat and the jth habitat, respectively. N denotes the highest number of species and D denotes the dimension of a solution.
Finally, following the migration and mutation, the specific habitat is compared against the criterion habitat. If the HSI value of this specific habitat exceeds that of the criterion habitat, the latter is substituted by the former. Otherwise, the criterion habitat should be preserved as a better option. By continuously iterating evolution, the algorithm preserves the better conditions, improving the poor individual and guiding the search process towards the global optimal individual approximation.
4 Proposed MAMHOA algorithm
In this section, we present our proposed algorithm, i.e. a multi-agent meta-heuristic optimization algorithm (MAMHOA) with an approach for single-document extractive text summarization. Given that the proposed algorithm uses BBO algorithm, firstly, we need to consider the following three initial important definitions:
-
1.
In MAMHOA, each agent represents a habitat in BBO, which is a candidate solution.
-
2.
In MAMHOA, all habitats live in a lattice-like environment, where each habitat is placed at one point of the lattice. These habitats seek to increase habitat suitability index (HSI) as well as efficiency through competing or cooperating with their neighbors.
-
3.
The lattice dimension is \( H_{\text{size}} \times H_{\text{size}} \) so that habitats can be placed on this lattice as shown in the Fig. 3.
According to formulas (6) and (7), habitat location is defined as row and column (r, c), and represented as \( H_{r,c} \), \( r,c = 1,2,3 \ldots H_{\text{size}} \) in lattice, so the neighbors of a habitat are defined as follows:
Figure 4 shows each habitat along with its four neighbors, which forms a small local environment that can only sense the environment. As already mentioned, habitats compete or cooperate with their neighbors to achieve the final goal, i.e. the best or near the best result. Given that a habitat only senses its local environment, competition and cooperation can only occur between the habitats and their neighbors. This helps to spread the information to the entire lattice. Given that competition and collaboration are done locally, the information transmission is slow across the habitat lattice.
4.1 Habitats selection strategies used by MAMHOA
The following procedure is performed to achieve better solutions more quickly and accurately. Initially, the habitats compete with each other and with their neighbors based on their competencies. Moreover, habitats use the self-learning operator to improve their ability (Figs. 3, 4).
The competition operator is performed on the habitat located at (r, c), \( H_{r,c} = (h_{1} ,h_{2} , \ldots ,h_{n} ). \) we consider that the maximum cost function is \( \max_{r,c} = (m_{1} ,m_{2} , \ldots ,m_{k} ),k = (1,2, \ldots ,n) \) and \( H_{r,c} \) is the habitat neighbor with the maximum habitat suitability index (HSI). Therefore, given the neighbor relations shown in Fig. 4, the best habitat conditions are as follows: \( \max_{r,c} \in {\text{Neighbors}}_{r,c} \) \( \forall \alpha \in {\text{Neighbors}}_{r,c} \)\( f(\alpha ) \le f({ {\max} }_{r,c} ) \). If the habitat suitability index is represented as \( {\text{HSI}}(H_{r,c} ) \ge {\text{HSI}}(\max_{r,c} ) \), then \( H_{r,c} \) is the winner because its cost function value is greater than the maximum cost function of its neighbors, so it can still live in the habitat lattice and its position does not change in the search space. Otherwise, \( H_{r,c} \) is the loser and the position of \( H_{r,c} \) in the habitat lattice should be occupied by the new habitat \( ({\text{New}}_{r,c} ) \). The proposed algorithm presents two strategies to select the new habitat.
-
1.
In the first strategy, when \( H_{r,c} \) is the loser, it has still useful information to obtain good solutions. Hence, this strategy as a kind of heuristic crossover involves using the information of loser habitat as well. The new habitat, represented by \( {\text{New}}_{r,c} = (e_{1} ,e_{2} , \ldots ,e_{n} ) \), is calculated using formula (8) where rand (0, 1) is a random number in [0, 1] and \( X_{{\max} } ,X_{{\min} } \) are the maximum and minimum HSIs of habitats, respectively.
-
2.
In second strategy, \( \max_{r,c} \) is first mapped to [0, 1], using formula (9) and new habitat neighbor, i.e.\( {\text{New}}_{r,c} \) is calculated using formula (10).
If \( 1 < r_{1} < n \), \( 1 < r_{2} < n \) and \( r_{1} < r_{2} \). Finally, \( {\text{New}}_{r,c} \) is obtained through mapping \( New_{r,c} \) back to \( [X_{{\min} } ,X_{{\max} } ] \) by formula (11).
As described above, the first strategy increases exploitation and the second strategy increases the exploration. The \( p_{c} \) probability parameter determines which of two strategies should be used. That is, if \( {\text{rand}}(0,1) < p_{c} \), then the first strategy is used, otherwise the second strategy is used. The algorithm in Fig. 5 presents the competition between self-learning operators in MAMHOA algorithm to create best solution. Table 1 shows the initial values of the used parameters in MAMHOA algorithm (Fig. 5). The initial parameters are based on original BBO parameters (Simon 2008).
As shown by the algorithm in Fig. 5 and Table 1, the variables and operators used in the proposed algorithm are as follows: Crossover probability (Pc), Mutation probability (Pm), Habitat Number (Habitats), Lower limit of decision variables (LB), Upper limit of decision variables (UB), Self-learning Operator (SR). According to algorithm in Fig. 5 and Figs. 3 and 4, initially a \( H_{\text{size}} \times H_{\text{size}} \) lattice is created. Then, each c and r location is assigned a habitat. Given the SR probability, the random values are generated and assigned to each habitat in terms of optimum conditions. Then, the habitats are compared in terms of their conditions and the habitat with the highest HSI is selected as the best habitat. Consequently, the migration occurs from a habitat to a new habitat with better conditions.
4.2 Documents summarization processing
In this section, the process of documents summarization using MAMHOA algorithm is shown in Fig. 6. According to Fig. 6\( D = \{ d_{1} ,d_{2} , \ldots ,d_{n} \} \) are input source documents. Each \( d_{i} \) consists of a collection of sentences \( S = \{ s_{1} ,s_{2} , \ldots ,s_{n} \} \) and each \( s_{i} \) consists of a collection of words \( W = \{ w_{1} ,w_{2} , \ldots ,w_{n} \} \). The summarization steps are shown as follows:
Step 1: Input documents
Reading documents \( D = \{ d_{1} ,d_{2} , \ldots ,d_{n} \} \) including n sentences \( S = \{ s_{1} ,s_{2} , \ldots ,s_{n} \} \) and n words \( W = \{ w_{1} ,w_{2} , \ldots ,w_{n} \} \).
Step 2: Preprocessing
Firstly, we pre-process each document (D) before passing it on to the next step. In pre-processing step the following tasks are performed:
-
1.
Omitting additional characters such as {}, [], …
-
2.
Omitting unessential words such as at, in, on, and, of, …
-
3.
Finding the roots of the words and the verbs used in the main text.
-
4.
Separating words and omitting repeated words.
-
5.
Identifying the end of the sentences and separating the sentences.
-
6.
Calculating the number of words (W) and sentences (S) used in the main text.
Step 3: Weighting sentences and create similarity matrix
After pre-processing (step 2), the sentences extracted are weighted using TFIDF method in formulas (12) and (13) (Ledeneva et al. 2008). According to formulas (12) and (13),\( {\text{freq}}_{ij} \) represents the number of repetitions of i word in j sentence, \( {\text{freq}}_{lj} \) represents l word in j sentence, \( \max_{i} {\text{freq}}_{lj} \) represents the maximum number of i word repetitions in j sentence, N and \( n_{i} \) represent the number of sentences in an input text and the number of sentences, respectively. Then, a weight of each word in sentence \( w_{ij} \) is calculated, using formula (14).
Following the sentences weighting, a similarity matrix including extracted sentences and words are created and weighted, using formula (15). In a similarity matrix, sentences are compared based on their keywords and important words. In formula (15), \( {\text{sim}}(s_{i} ,q) \) is the sentence similarity matrix, \( w_{iq} \) and \( w_{ij} \) are keywords and title weight, and the weight of each word, respectively.
Step 4: Processing in MAMHOA
In this step, as explained in Sect. 4.1, a habitat lattice is created and its parameters are set (Fig. 5). Here, the steps below are followed:
-
1.
Creating a habitat lattice \( H_{\text{size}} \times H_{\text{size}} \) and setting parameters.
-
2.
Assigning similarity matrix including extracted sentences from step 3 to habitats.
-
3.
Evaluating habitat’s conditions in order to find the best habitat (Best summarize).
-
4.
Evaluating new habitat repeatedly if conditions are not met.
5 Experiments and discussions
In this section, to evaluate the performance of the proposed algorithm, several experiments have been conducted whose results are reported below. In Experiment 1, we have compared our proposed algorithm with four non-multi-agent algorithms for 200-word and 400-word documents summarization. In the experiment 2, we have compared the proposed algorithm with several Baseline and state-of-the-art methods on 100-word documents published in the literature on DUC 2002. For all experiments, each reported value is obtained by averaging over 200 runs. In all of the tables of our experiments, the best value of each row or each column is shown in bold.
5.1 DUC 2002 datasets
DUC 2002 is a standard datasets of documents for evaluation in the field of summarization, created by NIST (National Institute of Standards and Technology). In all of the experiments, we chose 9 DUC 2002 topics (D061, D065, D067, D070, D073, D075, D079, D085, and D0105). Totally, 73 documents were chosen for tests. Table 2 presents the characteristics of the topics in DUC2002 dataset, which are used in Experiment 1 and Experiment 2.
5.2 Evaluation measures
To analyze the performance of the proposed algorithm we used Recall-Oriented Understudy for Gisting Evaluation (ROUGE) (Lin 2004). This software includes different packages, which determine the similarity between a summary generated by computer and a summary generated by human. It has been shown that ROUGE is very effective for measuring document summarization. The ROUGE metrics are defined according to different Ns and different strategies, such as ROUGE-N (N: 1, 2), ROUGE-L, ROUGE-S and ROUGE-SU. ROUGE-1 is used to compare the overlap between the system summary and the manual summaries created by human. ROUGE-2 is the measure between the candidate summary and final summary. ROUGE-L computes the ratio of the length of the summaries’ longest common subsequence (LCS) to the length of the reference summary. ROUGE-S uses the overlap ratio of a candidate summary to a set of reference summaries. ROUGE-SU is an extension of ROUGE-S with the addition of unigram as the counting unit. Each of these ROUGE evaluation methods generates three criteria (Recall, Precision, and F-score). In this paper, we report the average of scores generated by ROUGE-1, ROUGE-L and ROUGE-SU values. The criteria of Precision, Recall and F-score are calculated using formulas (16) and (17) and (18) (Abuobieda et al. 2013), respectively. The criterion of Precision in formula (16) is an intersection of extracted summarized sentences and an ideal summary of sentences divided by all the extracted sentences. The criterion of Recall in formula (17) is an intersection of relevant sentences and retrieved sentences divided by all the relevant documents, and F-score in formula (18) is a statistical criterion which is a combination of precision and recall criteria and shows the score for final selected sentences in a summary.
5.3 Experiment 1: Comparison of proposed algorithm with the non-multi-agent algorithms
In this experiment, we compare the performance of the proposed algorithm with four non-multi-agent optimization algorithms i.e. BBO (Mirshojaee et al. 2017), CSOA (Mirshojaei and Masoomi 2015), BFOA (Asgari and Masoumi 2013) and GA (Fattah and Ren 2009). Tables 3, 4, and 5 show the analytical results of Precision (P), Recall (R), and F-score (F) criteria for 200-word text summarization using ROUGE-1, ROUGE-L and ROUGE-SU. In experiment 1, we evaluate the results of F-score criteria of our algorithm with other algorithms.
Figure 7 and graph in Fig. 8 compare the average of F-score criteria of the proposed algorithm with those obtained by other optimization Algorithms. As the results show, the summarization F-score criterion for 200-word document has optimized.
In Tables 4 and 5, the comparison of the Precision, Recall and F-score criteria are shown by ROUGE-L and ROUGE-SU.
Below, Tables 6, 7 and 8 show the analytical results of Precision (P), Recall (R), and F-score (F) criteria for 400-word documents using ROUGE-1, ROUGE- L and ROUGE-SU.
Figure 9 and graph in Fig. 10 show that the proposed algorithm has an optimum performance compared to CSOA and BFOA. According to the graph in Fig. 10, the performance of the proposed algorithm is near that of GA and BBO.
Table 9 and graphs in Figs. 11 and 12 show the results of comparison between the average of F-scores obtained for ROUGE-1, ROUGE-L and ROUGE-SU on 200-word and 400-word documents.
5.3.1 Discussion 1
At experiment 1, we discuss four non-multi-agent optimization algorithms including BBO, GA, CSOA and BFOA, which are, cited approaches in the literature. These algorithms are selected for comparative analysis on DUC 2002 documents. The analytical values of F-score criteria are generated using ROUGE Software. The Tables 3, 4, 5, 6, 7, and 8 shows the results of F-score criteria obtained for ROUGE-1, ROUGE-SU, and ROUGE-L. The Table 9 and graphs in the Figs. 11 and 12 show the averages of F-score values are broken down into ROUGE-1, ROUGE-L and ROUGE-SU for 200-word and 400-word documents, respectively.
From the results obtained from ROUGE-1, ROUGE-SU, and ROUGE-L the following points can be made.
5.3.1.1 200-word documents summarization
-
1.
Based on the results of ROUGE-1 (Table 3), the f-score criteria extracted using the proposed algorithm for documents d061, d065, d070, d075, d079, d085 and d105 are more optimal than those obtained by other algorithms. Given the criterion of ROUGE-1, there is more overlap between the extracted sentences of the summary and the sentences of the original text. CSOA performs better on documents d067 and d073.
-
2.
Based on the results of ROUGE-L (Table 4), the f-score criteria extracted using the proposed algorithm for documents d065, d070, d075, d085 and d105 are more optimal than those obtained by other algorithms. Given the criterion of ROUGE-L, the ratio of the length of the summaries’ longest common subsequence (LCS) to the length of the reference summary has optimized. Algorithm BBO performs better on documents d067, d073 and d061 and GA performs better on document d079.
-
3.
Based on the results of ROUGE-SU (Table 5), the f-score criteria extracted using the proposed algorithm for documents d070, d065, d075, d079 and d105 are more optimal than those obtained by other algorithms. Given the criterion of ROUGE-SU, word redundancy and repeated words have decreased in the summary. Algorithm BBO performs better on documents d073 and d061. GA performs better on document d067 and BFOA performs better on document d085.
5.3.1.2 400-word documents summarization
-
1.
Based on the results of ROUGE-1 (Table 6), the F-score criteria extracted using the proposed algorithm for documents d070, d065, d079, and d075 are more optimal than those obtained by other algorithms. Given the criterion of ROUGE-1, there is more overlap between the extracted sentences of the summary and the sentences of the original text. BFOA performs better on document d073. CSOA performs better on document d067. Algorithm BBO performs better on documents d061, d085 and d105.
-
2.
Based on the results of ROUGE-L (Table 7), the f-score criteria extracted using the proposed algorithm for documents d065, d070 and d079 are more optimal than those obtained by other algorithms. Given the criterion of ROUGE-L, the ratio of the length of the summaries’ longest common subsequence (LCS) to the length of the reference summary has optimized. Algorithm BBO performs better on documents d061, d105 and d085. GA performs better on document d075. CSOA performs better on document d067 and BFOA performs better on document d073.
-
3.
Based on the results of ROUGE-SU (Table 8), the f-score criteria extracted using the proposed algorithm for documents d070, d065, d075, d079 and d067 are more optimal than those obtained by other algorithms. Given the criterion of ROUGE-SU, word redundancy and repeated words have decreased in the summary. Algorithm BBO performs better on documents d061, d085 and d105. BFOA performs better on document d073.
In sum, based on the above discussion as well as Table 9 and graphs in Figs. 11 and 12, the proposed algorithm is generally more efficient in the summarization of 200-word documents compared to that of 400-word documents due to the following: more overlap between the summary and the original text, smaller ratio of the length of the summaries’ longest common subsequence to the length of the reference summary, and a decrease in the redundancy of words.
5.4 Experiment 2: comparison of proposed algorithm with the baseline and state-of- the- art benchmarks
In experiment 2, the efficiency of the proposed algorithm is compared to that of the algorithms discussed in literature review including baseline and state-of-the-art algorithms. In this experiment, the proposed algorithm was applied to 100-word documents. The results of the proposed algorithm were compared to those of other mentioned algorithms in terms of Recall criterion.
5.4.1 Baseline and state-of-the-art extractive summarization algorithms
In this section, we compare our algorithm with the several widely used baseline approaches (Table 10), and seven state-of-the-art algorithms published in literature on the dataset DUC 2002 (Table 11). We compared our work with baseline and state-of-the-art algorithms. The methods are selected based on state-of-the-art accuracy reported by the, which are the most cited approaches in the literature (i.e. between 2010 and 2017). A brief description of the baseline approaches and state-of-the art are shown in Tables 10 and 11.
5.4.2 Comparative analysis on DUC 2002 dataset
Table 12 show the average of Recall obtained for the summarization of 100-word datasets of DUC 2002. It can be observed that on these dataset, MAMHOA reports the highest averages for ROUGE-1 (51.0), ROUGE-SU (24.3) and ROUGE-L (47.2) on 100-word documents.
5.4.3 Comparative analysis on DUC 2002 benchmark
The averages of Recall criteria obtained from the MAMHOA are compared to Recall criteria obtained for Baseline and state-of-the-art approaches.
Table 13 and graph in Fig. 13 show the comparative results of efficiency for the proposed algorithm and baseline and state-of-the-art algorithms in terms of Recall, using ROUGE-1, ROUGE-L, and ROUGE-SU.
5.4.4 Discussion 2
The results in Table 13 show ROUGE-1 score of 51.0, ROUGE-SU score of 24.3 and ROUGE-L score of 47.2 on this dataset. As a result, our proposed algorithm (MAMHOA) has a better performance than many of the existing state-of-the-art text summarizers on DUC 2002 dataset with respect to ROUGE-1, ROGUE-SU and ROUGE-L scores such as ILP, graph-based approaches, Tgraph, Egraph, URANK, and even those, which are based on supervised learning such as SummaRuNNer and NN–SE. Our approach cannot surpass the ROUGE-1 score of a recently proposed CoRank approach, though our ROUGE-SU and ROUGE-L scores, which are better evaluation measures than ROUGE-1, are far better than the CoRank method.
The graph in Fig. 13 Show the comparison between the averages of Recall criterion obtained from the proposed algorithm with Baseline algorithms for 100-word documents. As shown in Fig. 13 and Table 13, the performances of proposed algorithm is better than Baseline and stat-of-the-art mentioned algorithms.
6 Conclusion
In this work, we have discussed an optimization algorithm named MAMHOA with two strategies for extractive text summarization to find the best or near the best optimum summary. The proposed algorithm is a hybrid of a biogeography meta-heuristic algorithm and multi-agent systems. The proposed algorithm finds the best or near the best summary by creating a habitat lattice and comparing them to find the best conditions. The proposed algorithm was used for an extractive document text summarization issue. Our algorithm was compared with several non-multi-agent optimization algorithms, baseline and state-of-the-art algorithms using DUC2002 datasets. The 100-word and 200-word and 400-word documents were selected for summarization.
In the first experiment, we compared performance of our algorithm with that of some non-multi-agent algorithm on 200-word and 400-word documents and F-score results were obtained. In the second experiment we compared our algorithm with Baseline and state-of-the-art algorithms with 100-word documents and the Recall results were obtained. The experimental results were analyzed, using ROUGE software. Generally, the results show that the proposed algorithm has a better performance than that of other mentioned algorithms so that it can be used for summarizing.
References
Abuobieda A, Salim N, Kumar Y et al (2013) An improved evolutionary algorithm for extractive text summarization. Intell Inf Database Syst LNAI 7803:78–89
Amancio JV, Tohalino R (2018) Extractive multi-document summarization using multi-layer networks. Physica A 503:526–539
Asgari H, Masoumi M (2013) Provide a method to improve the performance of text summarization using bacterial foraging optimization algorithm. Seventh Iran data mining conference
Chen X, Tianfield H (2016) Biogeography-based optimization with covariance matrix based migration. Appl Soft Comput 45:71–85
Cheng J, Lapata M (2016) Neural summarization by extracting sentences and words. In: Proceedings of the 54th annual meeting of the association for computational linguistics, Berlin, Germany, pp 484–494
Fang C, Mu D, Deng Z, Wu Z (2017) Word -sentence co-ranking for automatic extractive text summarization. Expert Syst Appl 72:189–195
Fattah A, Ren F (2009) GA, MR, FFNN, PNN and GMM based models for automatic text summarization. Comput Speech Lang 23:126–144
Feng Q, Liu S, Zhang J et al (2017) Improved biogeography based optimization with random ring topology and Powell’s method. Appl Math Model 41:630–649
Ferreira R, de Souza Cabral L et al (2013) Assessing sentence scoring techniques for extractive text summarization. Expert Syst Appl 40(14):5755–5764
Haghighi A, Vanderwende L (2009) Exploring content models for multi-document summarization. In: Proceedings of human language technologies: the 2009 annual conference of the north American chapter of the association for computational linguistics, Boulder, Colorado, pp 362–370
Harish G (2015) An efficient biogeography based optimization algorithm for solving reliability optimization problems. Swarm Evolut Comput 24:1–10
Khishe M, Mosavi MR, Kaveh M (2017) Improved migration models of biogeography-based optimization for sonar dataset classification by using neural network. Appl Acoust 118:15–29
Ledeneva Y, Gelbukh A, García-Hernández RA (2008) Terms derived from frequent sequences for extractive text summarization. In: Gelbukh A (ed) Computational linguistics and intelligent text processing. CICLing 2008. Lecture notes in computer science, vol 4919. Springer, Berlin, Heidelberg, pp 593–604
Li Z, Jing L (2016) A multi-agent algorithm for community detection in complex network. Physical A 449:336–347
Lin C-Y (2004) ROUGE: a package for automatic evaluation summaries. In: Proceedings of the workshop on text summarization branches out, Barcelona, Spain, 25–26 July 2004, pp 74–81
Lohokare MR, Pattnaik SS, Panigrahi BK et al (2013) Accelerated biogeography-based optimization with neighborhood search for optimization. Appl Soft Comput 13:2318–2342
Luhn HP (1958) The automatic creation of literature abstracts. IBM J Res Dev 2(2):159–165
Mihalcea R, Tarau P (2004) TextRank: bringing order into texts. In: Proceedings of the 2004 conference on empirical methods in natural language processing, Barcelona, Spain, pp 404–411
Mirshojaee SH, Masoumi B, Zeinali E (2017) Biogeography-based optimization algorithm for automatic extractive text summarization. Int J Ind Eng Prod Res 28(1):75–84
Mirshojaei SH, Masoomi B (2015) Text summarization using cuckoo search optimization algorithm. J Comput Robot 8(2):19–24
Nallapati R, Zhou B, dos Santos CN, Gulcehre C, Xiang B (2016) Abstractive text summarization using sequence-to-sequence RNNs and Beyond. In: Proceedings of the SIGNLL Conference on Computational Natural Language Learning, pp 280–290
Nenkova A, Vanderwende L (2005) The impact of frequency on summarization. Microsoft Research. Technical report MSR-TR-2005-101
Nenkova A, Maskey S, Liu Y (2011) Automatic summarization. In: Proceedings of the 49th Annual meeting of the Association for Computational Linguistics: Association for computational linguistics vol. 3, pp 1–3:86
Oufaida H, Nouali O, Blache P (2014) Minimum redundancy and maximum relevance for single and multi- document Arabic text summarization. J King Saud Univ Comput Inf Sci 26(4):450–461
Paraskevopoulos A, Dallas PI, Siakavara K et al (2017) Cognitive radio engine design for IoT using real-coded biogeography-based optimization and fuzzy decision making. Wirel Pers Commun 97:1813–1833
Parveen D, Ramsl H-M, Strube M (2015) Topical coherence for graph-based extractive summarization. In: Proceedings of the 2015 conference on empirical methods in natural language processing, Lisbon, Portugal, pp 1949–1954
Radev D, Erkan G (2004) Lexrank: graph-based lexical centrality as salience in text summarization. J Artif Intell Res 22:457–479
Rautray R, Balabantaray RC (2018) An evolutionary framework for multi document summarization using Cuckoo search approach: MDSCSA. Appl Comput Inf 14:134–144
Simon D (2008) Biogeography based optimization. IEEE Trans Evolut Comput 12(6):702–713
Steinberger J, Jezek K (2004) Using latent semantic analysis in text summarization and summary evaluation. In: Proceedings of 7th international conference ISIM’04, pp 93–100
Wan X (2010) Towards a unified approach to simultaneous single-document and multi document summarizations. In: Proceedings of the 23rd international conference on computational linguistics (Coling 2010), Beijing, China, pp 1137–1145
Woodsend K, Lapata M (2010) Automatic generation of story highlights. In: Proceedings of the 48th annual meeting of the association for computational linguistics, Uppsala, Sweden, pp 565–574
Yang J, Li L (2017) Improved biogeography-based optimization algorithm for mobile robot path planning. Chin Intell Syst Conf 2:219–230
Zhang D, Kanga Q, Chenga J et al (2018) A novel hybrid algorithm based on Biogeography-Based Optimization and Grey Wolf Optimizer. IEEE Trans Evolut Comput 67:197–214
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Mirshojaee, S.H., Masoumi, B. & Zeinali, E. MAMHOA: a multi-agent meta-heuristic optimization algorithm with an approach for document summarization issues. J Ambient Intell Human Comput 11, 4967–4982 (2020). https://doi.org/10.1007/s12652-020-01776-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12652-020-01776-8