Keywords

1 Introduction

Scheduling generation is a complex challenge faced by various institutions and organizations in their day-to-day operations, including academic, professional fields, and other contexts where coordinating and allocating resources is necessary.

The problem lies in the need to find an appropriate combination of elements and constraints to meet the demands and needs of all involved parties. It is crucial to ensure that schedules are fair and equitable, avoiding situations where individuals or groups are disadvantaged in terms of workload, class distribution, shifts, or any other relevant criteria.

In the Mechatronics Engineering school at the International University of Ecuador, scheduling for courses and activities presents a complex challenge due to the variety of resources, constraints, and preferences of students and professors. Efficient schedule planning is crucial to ensure a fair distribution of subjects, avoid conflicts, and optimize resource allocation. To address this problem, the optimization of a genetic algorithm (GA) is proposed, aiming to find optimal or near-optimal solutions using techniques inspired by evolutionary computation.

Currently, various research studies are related to automatic schedule generation using GAs and other optimization techniques. These approaches have proven to be effective in solving resource allocation and scheduling problems in different academic contexts, which could be classified into two categories: classical genetic algorithms ([1,2,3,4]) and hybrid genetic algorithms ([5,6,7]).

A common problem found in the explored works is the adjustment of the involved parameters, such as population size, mutation rate, and the number of generations, which can significantly affect the performance of the algorithm for automatic schedule generation. To address this issue, techniques like parameter optimization can be employed to find the required parameter values.

The project aims to contribute to the generation of optimal schedules in the field of Mechatronics Engineering at the International University of Ecuador. To achieve this, the development of a GA is proposed, employing an appropriate encoding to represent schedules and using genetic operators such as selection, reproduction, and mutation to generate new solutions. Evaluation and fitness techniques will be applied to assess the quality of each solution, and multiple iterations of the algorithm will be performed to gradually improve the results. The goal is to obtain optimal or near-optimal schedules that meet all the established constraints and preferences, thereby enhancing the efficiency of course planning and resource allocation.

The obtained results so far support the feasibility and effectiveness of the proposed GA for schedule generation in the field of Mechatronics Engineering at the International University of Ecuador, resulting in reduced execution time and improved solution quality. The algorithm demonstrated positive outcomes, including lower average execution time, fewer required generations to find a solution, increased stability, and convergence towards optimal solutions. Our results are attributed to a higher number of elite schedules, increased mutation rate, and random class selection during mutation, leading to improved stability and reduced computational resources required.

This document is organized as follows: In Sect. 2, a brief overview of the current state of research is provided. Section 3 includes the description of the database used for experimentation, along with the definition of parameters and characteristic operators of the GA. The experimentation process and the metrics employed to evaluate the GA’s performance are described in Sect. 4. The obtained results and the various comparisons conducted are presented in Sect. 5. Finally, Sect. 6 gathers the concluding remarks.

2 Related Works

The use of GAs in timetable generation has become an active and constantly evolving area of research, with numerous practical applications and possibilities for future development.

In the field of optimization, GAs have proven to be a powerful tool for solving complex problems. In particular, the generation of schedules is a task that can be solved using GAs and that has been the object of investigation in several studies. For instance, [1] and [8] described the implementation of a GA to generate university timetables with the goal of producing accurate solutions, although its implementation can be complicated and may require exhaustive tuning and testing. Authors in [9] described the application of a GA for teaching planning in a university, which, like the previous ones mentioned, can generate precise and efficient solutions, but its implementation may require a large amount of computing resources. Starting from the latter in terms of the limitation presented, it is similar to the one described in [6], with the difference that the work involved a hybrid GA to program the schedules of nurses in a hospital considering the nurse fatigue.

In the field of assigning academic timetables, various investigations have been carried out by using GAs. Among them are ’Assignment of academic schedules for the School of Civil Engineering in Computing of the University of Talca [10], which described the implementation of a GA to assign schedules in a school engineering. The advantage of this approach is that it can generate accurate solutions in a reasonable amount of time, although it may require adjustment to suit the specific needs of an engineering school. Other relevant articles presented in [11] and [12] described the planning of university timetables using GAs with an approach similar to [10], although with a complex implementation of the algorithm, which can require a large amount of computing resources as [9].

Authors in [3, 13] reported the application of a GA to generate exam schedules in a university, with precise and optimal solutions, although its algorithm can be difficult to fit in. The work introduced in [13] presented the generation of schedules in a highschool with time efficient solutions, but with an algorithm that may require tuning and testing. Both works are relevant to this project, as they are part of the main literature to improve the selected algorithm.

In [5], the authors proposed a hybrid approach that combines a GA with a greedy approach to generate school and exam timetables efficiently but with a limitation on the greedy approach that can lead to sub-optimal solutions. Authors in [2, 7] and [14] introduced an intelligent system based on a GA to plan schedules in various educational fields with optimal time solutions. With the limitation of being difficult to adjust the algorithm to adapt to the specific needs of the field to be applied. The GA introduced in [15] yield school timetables involving advanced technical knowledge.

Moreover, in the medical field, authors in [16] presented an appointment scheduling system for medical treatment using GAs presenting efficient solutions in terms of the use of time and medical resources, but using a large amount of information resources.

Based on the way of operating the works explored from the literature, we have identified two categories: 1) Classical GAs, which are capable of handling large search spaces and allow high-quality solutions to be found in a reasonable time, but can fall into local minima and do not always find the global optimal solution. Within this category we can find some articles such as [1, 2] and [3]. 2) Hybrid GAs, which combine different techniques to improve the precision and efficiency in solving the problem, as well as they can improve the quality of the solutions found by classical genetic algorithms and reduce the time needed to find them, but their implementation may require more effort and technical knowledge than conventional GAs. Some of the articles categorized in this second point are [5, 6] and [7].

3 Materials and Methods

3.1 Data

The database collects the information from seven semesters of the School of Mechatronics of the International University of Ecuador, as is shown in Table 1.

Table 1. Description of the data that will be used.

Each of the chosen parameters are described in a general way, in [17] you can find the database in detail.

3.2 Proposed Algorithm

The proposed GA depicted in Fig. 1 was inspired by the work titled “Development of an exam scheduling system using genetic algorithm” [3], and aims at finding the best assignment of available subjects and resources.

Fig. 1.
figure 1

Flowchart of the proposed genetic algorithm to generate the academic schedules.

Database: Within the database there are 5 academic objects: subjects, semesters, classrooms, teachers, and meeting schedules; these data will be imported into the GA as arrays.

Initialization of Population: A initial population of 9 random schedules based on the 5 aforementioned objects were created. This initial population served as a starting point for the search and optimization process that follows in the GA.

Fitness Calculation: The goal was to find the optimal schedule removing conflict with the objects that compose it. These conflicts can arise from conflicting reservations of professors or classrooms, classroom capacity, and professor availability. When the number of conflicts was zero, the problem was considered solved.

Parent Selection: If a schedule with zero conflicts was not found, parent selection by tournament was performed. This step selected randomly three schedules from the population in order to find the two best parents for the next generation.

Crossover: Each class in the new schedule, made up of the 5 aforementioned objects, was crossed with a randomly selected class from the parent schedules, thus producing reproduction through a crossing point.

Mutation: For schedules in the population that are not part of the elite, random classes are mutated with a probability determined by the mutation rate. If a mutation occurs, the selected class is replaced with a random class from another schedule generated by the population initialization method.

Elitism: This ensured that the best fitness schedule obtained so far was kept in the population and transmitted to future generations without changes, avoiding the loss of high-quality solutions.

3.3 Changes Applied to [3]

We conducted a set of experiments to improve the overall performance of the work proposed in [3], which involved adjustments to the elitism parameter and mutation rate, as well as a modification to the mutation operator technique.

The number of elite schedules was increased from 1 to 2. This means that the two best schedules from each generation will be preserved in the next generation. The mutation rate was increased from 0.1 to 0.2 aiming at promoting greater solution diversity.

Regarding the mutation technique, the probabilistic approach to determining which classes will be mutated was maintained. However, a change was introduced in the mutation process. In [3], mutation was performed sequentially, that is, class by class. In this work, we maintained the ordered sequence of the algorithm that will be mutated, but a random position within the schedule to be introduced was chosen. With these changes, it was expected to explore different schedule combinations and increased solution diversity in the search for improved optimization.

4 Experimental Setup

For experimental purposes, the hardware resources involved an 8-core AMD Ryzen 7 5800X processor, 16 GB of RAM, and an LG 20MP38HQ-B monitor. The software resources involved Windows 10 Pro x64-bit, Python in Google Colab, and Opera web browser. The algorithm was implemented through Python software routines. In the development of the proposed algorithm, the following libraries were employed: time, math, sqlite3, prettytable, random, and enum.

4.1 Metrics

  • Generations: The number of generations required to find the solution to the problem.

  • Execution Time: The execution time is an important indicator of the algorithm’s efficiency in obtaining high-quality schedules within a reasonable time.

  • Overall Entropy: An algorithm that tends to generate similar solutions in each iteration is less desirable than one that produces diverse solutions. The diversity of the population was measured using the population’s entropy, represented by (1).

    $$\begin{aligned} H = -\sum (\rho \cdot {\log _2} (\rho )) \end{aligned}$$
    (1)

    where:

    • \(\rho \) proportion of each type of solution in the population;

    • \(\log _2\) logarithm base 2.

    The values of H can range from 0 to \(\log _2(N)\), where N is the size of the population. A value of H close to 0 indicates a homogeneous population, while high entropy indicates a more diverse population.

  • Overall Convergence: Convergence refers to how quickly the genetic algorithm finds an optimal solution or one close to it. The convergence speed was measured by the fitness improvement rate represented by (2).

    $$\begin{aligned} \begin{aligned} Fitness\; improvement \;rate\; \\ =\frac{current\; generation\; avg.\; fitness-avg.\; fitness\; of\; the \;previous \;generation}{avg.\; fitness\; of\; the\; previous \;generation} \end{aligned} \end{aligned}$$
    (2)
  • Average Fitness: It measures how well a schedule meets the established requirements and constraints, i.e., its quality. In this case, the average fitness of each generation was calculated, and then an overall average was obtained.

4.2 Parameters of the Evolutionary Algorithm

Table 2 includes the comprehensive set of parameters utilized in our algorithm. These parameters hold significant importance in driving the optimization process effectively. The parameter values were carefully established experimentally and were also drawn from literature as shown the column value selection protocol in Table 2.

Table 2. Parameters values used in the proposed algorithm.

The code is available at the following link: source code.

5 Experimental Results

5.1 The Proposed Algorithm vs. [3]

The increase to two elite programs permitted our algorithm to retain high-quality solutions from one generation to the next one, accelerating the convergence of the algorithm. Although there could have been an effect on the population entropy due to the repetition of solutions, this problem did not arise.

The increase in the mutation rate, specifically from 0.1 to 0.2 permitted our algorithm to explore more solutions. It is important to note that a high mutation rate can make it difficult to find the solution and generate less stable or suboptimal solutions. However, in this particular case, this problem was not experienced.

The change applied to the mutation operator favored greater diversity and increased the chances of finding optimal solutions. Although there is a possibility of generating less coherent schedules or schedules that do not meet certain restrictions, it was established that the random position should be within the allowed values in the schedule, thus avoiding this situation.

In summary, the changes proposed in our algorithm provided benefits such as faster convergence, wider exploration of solutions and greater diversity, without generating significant problems in solution quality or schedule coherence.

Table 3 and 4 depict the results obtained from 10 iterations of each algorithm. In order to facilitate the comprehension of the outcomes for each scenario, Fig. 2 provides a visual comparison of both algorithms across various metrics.

Table 3. Overall performance of the algorithm presented in [3] through 10 executions.
Table 4. Overall performance of the proposed algorithm through ten executions.
Fig. 2.
figure 2

Overall performance of the proposed algorithm in comparison with [3].

5.2 Discussion

From the obtained results Fig. 2 and Table 4, it can be observed that the proposed algorithm achieved a significant improvement in terms of the evaluated metrics. Firstly, it reduced the average algorithm execution time, which led to a decrease in the average number of generations required to find the solution. In [3], a maximum of 193 generations and a minimum of 9 generations were required, while in our algorithm, these values were reduced to a maximum of 10 generations and a minimum of 4 generations as can be seen in Table 3 and 4.

In addition, the obtained results depicted in show greater stability in the search for the solution in the optimized algorithm compared to the original as can be seen in Fig. 2. In the original algorithm, there were cases where the results were considerably dispersed, while in the optimized algorithm, this dispersion was significantly reduced.

Regarding entropy, it remained at the same level as in the original algorithm since it was already at its maximum value as shown in Fig. 2c. Therefore, the changes made did not affect this aspect.

In terms of the overall convergence of the algorithm, the optimized algorithm has achieved similar convergence to the original algorithm but in fewer generations as seen in the Fig. 2d. This means that the solutions obtained in each generation were optimal and approached closer to the final solution as the execution of the algorithm progressed.

In both algorithms, a very good average fitness has been obtained with an almost linear trend Fig. 2e. This indicates that the algorithm has been effective in improving solutions generation after generation.

In summary, the experiment has demonstrated an improvement in evaluated metrics such as execution time, number of required generations, result in stability, and quality of obtained solutions. These results support the effectiveness of changes implemented in genetic algorithms.

6 Conclusion

In conclusion, this work aimed to optimize a GA for scheduling, seeking to reduce execution time and improve the quality of obtained solutions. The results obtained were very positive, showing a clear improvement in evaluated metrics. A significant reduction was achieved in the average execution time and the number of generations required to find the solution, as well as greater stability and convergence towards optimal solutions in fewer generations. The main advantage of the proposed algorithm lies in the combination of a higher number of elite schedules preserved in each generation, a higher mutation rate, and a random selection of classes during mutation, achieving greater stability and reduction of computational resources compared to the original algorithm. This allows a greater diversity of solutions and a more efficient search. However, a possible disadvantage of the algorithm could be its lack of generalizability to other databases since the algorithm would have to be adapted to the constraints of the new data which may imply changes in the implemented code.

As a future improvement, it is suggested to consider incorporating crossover operators into the proposed genetic algorithm. This would allow combining features of elite schedules and exploring new solutions, which could further increase diversity and quality of obtained solutions. Overall, the obtained results and possible future improvements highlight the effectiveness and potential of the proposed genetic algorithm in scheduling.