1 Introduction

Software engineering is the application of a systematic and disciplined approach to the design, development, operation and maintenance of software [31]. Detecting software vulnerabilities is a critical step in ensuring the quality and security of software [28]. However, software testing is time-consuming and costly, consuming almost 50% of the resources devoted to developing a software system [27, 29]. Automated software testing is generally preferable to manual testing, yet very few test data generation tools are currently available commercially [6].

Genetic algorithms (GA) are a family of evolutionary algorithms conceived by John Holland in the United States in the late 1960s [7]; they have been widely used in software engineering as an optimization method [13]. Consequently, this research focuses on the technology and application of genetic algorithms to software engineering problems.

This systematic review follows the protocol proposed in [16, 17, 22]. Based on the existing literature on the application of genetic algorithms in software engineering, the following objective was set as a guide for this review: “Specify the most recent research about the applications of genetic algorithms in software engineering”. Because the phases of software development require greater optimization and resources, the following research questions (RQ) were defined:

  • RQ1: What problem does the genetic algorithm solve in software engineering?

  • RQ2: What application and technology do genetic algorithms have in Software Engineering?

Section 2 describes the execution of the systematic literature review (SLR), whose results are presented in Table 2. On the basis of these results, Sect. 3 presents the most notable details and a synthesis of the arguments discussed in the 20 primary studies. Section 4 answers the research questions, draws the conclusions of the review, and proposes specific lines of future research.

2 Review Protocol Development

2.1 Research Identification

Search sources were chosen for their web accessibility and for providing search engines that support advanced queries. The following were used: ACM [2], IEEE library [1], SCOPUS Library [11] and RRAAE [23].

The keywords were chosen from the research questions and from the keywords of previously reviewed articles: optimization, software engineering, genetic algorithms, genetic programming, evolutionary algorithms, requirements and testing.

Searches were performed using the logical operators AND and OR, and the following inclusion criteria were applied:

  • Publications from 2012 onward.

  • Search results in the area of computer science.

  • Documents written in Spanish or English.

  • Keywords appear in the abstract of the article.
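
As an illustration, the inclusion criteria above can be expressed as a filter over search results. The following Python sketch is not the procedure used in the review: the `Record` fields, the substring match on the subject area, and the case-insensitive keyword match are all assumptions made for the example.

```python
from dataclasses import dataclass

# Keywords listed in Sect. 2.1; matching is case-insensitive (assumption).
KEYWORDS = (
    "optimization", "software engineering", "genetic algorithms",
    "genetic programming", "evolutionary algorithms", "requirements", "testing",
)

@dataclass
class Record:
    # Hypothetical shape of one search result; field names are assumptions.
    title: str
    year: int
    language: str   # e.g. "English" or "Spanish"
    area: str       # subject area reported by the bibliographic source
    abstract: str

def meets_inclusion_criteria(rec: Record) -> bool:
    """Return True when a record passes all four inclusion criteria."""
    recent = rec.year >= 2012
    in_area = "comput" in rec.area.lower()  # crude subject-area match (assumption)
    in_language = rec.language.lower() in {"english", "spanish"}
    abstract = rec.abstract.lower()
    has_keyword = any(keyword in abstract for keyword in KEYWORDS)
    return recent and in_area and in_language and has_keyword
```

A record published in 2010 would be rejected by the first check regardless of its abstract, which mirrors how the criteria act as a conjunction.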

Table 1 lists the search strings used in the different bibliographic sources.

Table 1. Bibliographic sources and search strings.

2.2 Selection of Primary Studies

Once the search results were obtained, the criteria for selecting and evaluating the primary studies were defined. Results that were not relevant to the stated objective were discarded according to the following exclusion criteria:

  • Studies that do not contain information that helps answer research questions RQ1 and/or RQ2.

  • The abstract and content contain no information about the application of the algorithms in software engineering.

  • Work that is poorly structured or unclear.

  • The conclusion lacks information relevant to the investigation.

2.3 Data Extraction

Table 2 presents the relevant information for each of the selected articles (S01...S20), namely: (RQ1) the problem that genetic algorithms solve in software engineering and (RQ2) the application and technology of genetic algorithms in software engineering.

Table 2. Data extraction from the primary studies.

2.4 Data Synthesis

In the present review, 127 studies were taken into account for the analysis, of which 20 were selected as primary studies. Their synthesis is summarized below (Table 3):

Table 3. Summary of reviewed studies.

Figure 1 shows the incidence of the analyzed studies. Only one of them, S01 [32], has had a very significant influence, serving as a direct reference for 13 other studies. S15 and S16 have had a medium impact, having been referenced more than 4 times. The remaining studies have had a low impact, having been referenced in fewer than 4 studies, and in three cases (S08, S09 and S13) the incidence is null, since no other study references them.

Fig. 1. Synthesis by impact.

Figure 2 shows the direct participation of the most prominent authors in the analyzed studies. As can be seen, no author has collaborated in a study other than the one listed for them. Given these results, it can be concluded that the topics are related and the results obtained are as expected, so a second study to corroborate the data was not considered warranted.

Fig. 2. Synthesis by author.

Fig. 3. Synthesis of the GA applications.

Figure 3 clearly shows that the main application is in the testing phase of a project, where the tests can be regression, initial or final tests. Genetic algorithms can also be applied to project management, where they help optimize the project as a whole, as indicated by 4 of the studies. For production, distributed systems and component-based software engineering, the application of these algorithms has been tested in only a single project each; they are therefore not considered relevant study points for the application of genetic algorithms.

Figure 4 shows the technologies used to implement the genetic algorithms. Java is used most often for development, not only because it is a free programming language but also because it is cross-platform and widely adopted. Another technology is C++, which several of the analyzed studies used to program these applications. A large percentage of the studies did not implement any development technology, since the algorithms were applied only at a design or initial stage. Matlab, OLC and Web technologies were each applied in only two of the analyzed studies, so they can be considered of little relevance or adoption in the development of these applications. Most of the development technologies used are object-oriented and freely available.

Fig. 4. Synthesis of technology.

3 Discussion

S01 and S08 agree that testing in software product lines can be appropriately minimized using genetic algorithms, since the total number of test cases is reduced, which improves testing efficiency.
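
To make the idea of test suite minimization concrete, the following Python sketch shows one way a genetic algorithm could select a smaller set of test cases that still covers every feature. It is not the method of S01 or S08: the bit-mask encoding, the fitness weighting, the operators, and the sample `coverage` data are all assumptions chosen for illustration.

```python
import random

def fitness(mask, coverage):
    """Score a selection mask: covered targets first, then fewer tests."""
    covered = set()
    for selected, targets in zip(mask, coverage):
        if selected:
            covered |= targets
    # One extra covered target always outweighs any saving in suite size.
    return len(covered) * (len(mask) + 1) - sum(mask)

def minimize_suite(coverage, generations=200, pop_size=30,
                   mutation_rate=0.05, seed=0):
    """Return a 0/1 mask over test cases (needs at least two test cases)."""
    rng = random.Random(seed)
    n = len(coverage)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    population[0] = [1] * n  # seeding the full suite keeps full coverage reachable

    def score(mask):
        return fitness(mask, coverage)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(population, 3), key=score)  # tournament selection
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)

# Hypothetical data: the features each test case exercises.
coverage = [{"f1", "f2"}, {"f2"}, {"f3"}, {"f1", "f3"}, {"f4"}]
best = minimize_suite(coverage)
selected = [i for i, bit in enumerate(best) if bit]
```

Because the full suite is seeded into the population and the best masks are always carried over, the returned selection never loses coverage; whether it is the smallest possible suite depends on the run.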

S02, S14 and S17 propose genetic algorithms as a solution for automatic test generation, since the software development process invests at least 50% of its total cost in software testing.

S03, S04, S13, S15 and S18 agree that the benefits of software reuse multiply if it is carried out in the early stages of software development. Sequence diagrams are commonly used to model the functionality of software systems in the early stages of the software development life cycle.

S05, S06 and S10 emphasize that test case prioritization is an essential task that considerably reduces the testing effort in the maintenance phase. These articles propose a framework for prioritizing test cases using a tool based on a genetic algorithm, developed in Java.
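
As a rough illustration of genetic test case prioritization, the Python sketch below evolves an ordering of test cases. It does not reproduce the Java tool of S05, S06 and S10: the permutation encoding, the use of the Average Percentage of Faults Detected (APFD) metric as fitness, the operators, and the `detects` data are assumptions made for this example.

```python
import random

def apfd(order, detects):
    """APFD of a test ordering; detects[t] is the set of faults test t reveals.

    Assumes at least one fault is revealed by the tests in `order`.
    """
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for fault in detects[test]:
            first_pos.setdefault(fault, pos)  # position of first detection
    n, m = len(order), len(first_pos)
    return 1 - sum(first_pos.values()) / (n * m) + 1 / (2 * n)

def crossover(p1, p2, rng):
    """Keep a slice of p1 in place; fill the rest in the order found in p2."""
    a, b = sorted(rng.sample(range(len(p1) + 1), 2))
    kept = p1[a:b]
    rest = [t for t in p2 if t not in kept]
    return rest[:a] + kept + rest[a:]

def prioritize(detects, generations=100, pop_size=20, mutation_rate=0.2, seed=0):
    """Return a permutation of test indices with high APFD."""
    rng = random.Random(seed)
    n = len(detects)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]

    def score(order):
        return apfd(order, detects)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(population, 3), key=score)  # tournament selection
            p2 = max(rng.sample(population, 3), key=score)
            child = crossover(p1, p2, rng)
            if rng.random() < mutation_rate:  # swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)

# Hypothetical fault matrix: test 2 reveals both faults, test 3 reveals none.
detects = [{"b1"}, {"b2"}, {"b1", "b2"}, set()]
best_order = prioritize(detects)
```

With this data, an ordering that runs test 2 first scores an APFD of 0.875, while one that starts with test 3 followed by tests 0 and 1 scores 0.5, so the fitness rewards orderings that reveal faults early.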

S07 and S12 agree that identifying faults in the very early phases of the software development life cycle is essential. This helps software developers focus more on quality assurance, make better use of the workforce, and reduce the cost of debugging in particular.

S11, S16, S19 and S20 agree that deadline-constrained scheduling in project management is a highly relevant optimization problem in software engineering and other real-life situations. It concerns planning the activities that must be completed before specified dates, so as to solve the scheduling problem under the constraints of project management. Genetic algorithms have been designed to compute accurate solutions in reduced execution times.
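
One common way to apply a genetic algorithm to such a scheduling problem is to let each chromosome be a task priority list, decode it into a schedule, and penalize schedules that miss the deadline. The Python sketch below shows only that decoding and cost step, under stated assumptions: it is not taken from S11, S16, S19 or S20, and the task data, the identical-worker model, and the penalty weight of 10 are hypothetical.

```python
def decode_schedule(priority, duration, preds, workers):
    """Build a schedule from a priority list; returns {task: (start, finish)}.

    Repeatedly takes the highest-priority task whose predecessors are already
    scheduled and places it on the worker that becomes free first.
    Assumes the precedence relation in `preds` has no cycles.
    """
    finish = {}
    schedule = {}
    free_at = [0] * workers
    remaining = list(priority)
    while remaining:
        task = next(t for t in remaining
                    if all(p in finish for p in preds.get(t, ())))
        remaining.remove(task)
        w = min(range(workers), key=lambda i: free_at[i])
        ready = max((finish[p] for p in preds.get(task, ())), default=0)
        start = max(free_at[w], ready)
        end = start + duration[task]
        free_at[w] = end
        finish[task] = end
        schedule[task] = (start, end)
    return schedule

def cost(priority, duration, preds, workers, deadline, penalty=10):
    """Makespan plus a penalty per time unit past the deadline (lower is better)."""
    schedule = decode_schedule(priority, duration, preds, workers)
    makespan = max(end for _, end in schedule.values())
    return makespan + penalty * max(0, makespan - deadline)

# Hypothetical project: C needs A; D needs B and C; two identical workers.
duration = {"A": 3, "B": 2, "C": 4, "D": 1}
preds = {"C": ["A"], "D": ["B", "C"]}
schedule = decode_schedule(["A", "B", "C", "D"], duration, preds, workers=2)
```

A genetic algorithm would then search over permutations of the task list, using `cost` as the value to minimize; the decoder guarantees that every candidate respects the precedence constraints.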

4 Conclusion and Future Work

Regarding the technology used, the highest percentage of the studies are based on Java, a free, object-oriented programming language that is used at all levels of programming. On the other hand, a large number of the works (S04, S13, S15, S18, S19) were still at the study stage, and many of them were carried out in UML or in the initial phases of their development process.

The primary studies S02, S03, S06, S07, S10 and S12 are specific and offer a clear answer regarding the problems that the application of genetic algorithms has solved. Likewise, the application of these algorithms allowed the optimization of several stages of software development, as well as of the technologies on which software can be executed. Genetic algorithms are not limited to, and show no restrictions regarding, the technologies that currently exist. Based on the primary studies S02, S04, S09, S11 and S15, genetic algorithms have several areas of application in software development. Most of these are in the design and testing stages, where they mainly streamline the control of programming errors and optimize time and costs, since these are generally the stages that concentrate most of the effort and budget of a project.

Finally, the applications of genetic algorithms have grown over time, with heavy use in the initial stages and in testing, since the results obtained in each of the studies clearly show that process optimization was achieved in all of them, which demonstrates an improvement in the cost-time-effort ratio. However, there remains a need for applications across the software life cycle when the process is driven by frameworks and agile methodologies that involve extensive interaction with the user.