1 Introduction

The timetabling of the final examinations is a difficult and crucial task that every university department has to solve regularly. Many versions of the examination timetabling problem (ETTP) have been proposed in the literature, as any educational institution has its own rules and practices. They range from very simple ones, in which only examination conflicts are taken into consideration, so that they turn out to be simple extensions of the graph coloring problem (see, e.g., Carter et al., 1996), to extremely complex ones that include room constraints, preferences, heterogeneous timeslots, and many other practical features (see, e.g., McCollum et al., 2007).

In this paper, we consider the real-world version introduced by Battistutta et al. (2020) that applies to Italian universities, which includes composite examinations, preferences and unavailabilities for periods and rooms, examinations in multiple rooms, and many other features. In addition, conflicts are based on predefined curricula, rather than on student enrollments as more commonly done. The original formulation of Battistutta et al. (2020) has been refined to better capture a few real-world practices and to correct some semantic issues.

For this problem, we propose a portfolio of solution techniques comprising both exact and metaheuristic methods. Regarding the exact techniques, we developed two mixed integer programming (MIP) models and a constraint programming (CP) model, where the latter model is an enhanced version of the one originally proposed by Battistutta et al. (2020). On the metaheuristic side, we improved the simulated annealing (SA) proposed by Battistutta et al. (2020), by developing a new neighborhood relation.

Our search methods have been tested and compared among themselves and with the method by Battistutta et al. (2020), providing insights into the performances of the different methods.

All instances and best solutions are available at https://bitbucket.org/satt/examtimetablinguniuddata for inspection and future comparison, in both JSON and dzn (MiniZinc data files) formats. Furthermore, to encourage future contributions from other researchers, we have also developed and made publicly available a solution checker to protect against possible misinterpretations of the data and the constraints. Thanks to this tool, we have also detected and corrected some discrepancies between the JSON and dzn formats in the original files of Battistutta et al. (2020).

The paper is organized as follows. The problem formulations are provided in Sect. 2. Section 3 is dedicated to the discussion on the related work. Our search methods are introduced in Sect. 4. The experimental analysis is illustrated in Sect. 5. Finally, conclusions and future work are discussed in Sect. 6.

2 Problem formulation

Our problem consists of scheduling the final examination of a set of courses in a given time horizon. Within the horizon, we have to schedule one or more examination of the same course. Each examination represents a fully independent round (or call), so that students have multiple chances to pass a course. In turn, each examination can be composed of one or two parts (written and oral), called events.

Events must be assigned to a period, which represents a timeslot and a day of the time horizon. Events take place in rooms, which are classified according to their size. A single event might require more than one room. Events might also require no room at all so that they are conventionally assigned to the dummy room. In particular, in the oral part, students are questioned sequentially one at the time, so that it can be done also at the teacher’s office.

Courses are grouped into curricula that determine conflicts and separations between their examinations. Each curriculum has a set of primary and secondary courses, which imply different levels of conflict and different distances among examinations.

In this formulation, we do not consider the enrollments of students to the specific examinations, as this information is not known at the time of the creation of the timetable. Therefore, the only sources of conflicts are the curricula and the presence of the same teacher for two courses.

In detail, the main entities of the problem are the following:

  • Courses, examinations, & Events For each course, there are one or more examinations. Each examination might be a single event (either written or oral) or composed by two events: the written part (first) and the oral part (second), to be scheduled in this strict order.

  • Rooms Events require zero, one, or more rooms. Rooms are classified as small, medium, or large, and for each written event, we set the number and the size of rooms requested. (Mixed requests are not allowed.) The sets of rooms that can be assigned to a single event are called composite rooms, and they are listed in the input data. Composite rooms are homogeneous, such that all rooms of the set have the same size. An event may also use a room larger than it requested, but this does not apply to composite rooms. Oral events might require at most one single room (of any size).

  • Days, Timeslots, & Periods The time horizon is divided into days, and each day is split into timeslots. (The same number of timeslots are given in each day.) A period is a pair day/timeslot.

  • Curricula Curricula are sets of courses that have students in common, which therefore undergo the corresponding examinations. The set of courses of a curriculum is classified in primary courses and secondary ones, according to the expected number of students of the curriculum that has to take the corresponding examination. As explained below, the level of conflict between primary and secondary examinations of a curriculum is different.

The problem consists in assigning a period and a room (single, composite, or dummy) to each event, satisfying the hard constraints and minimizing the violations of the soft constraints.

The hard constraints are the following:

H1.:

RoomRequest: Rooms assigned to an event must be of the correct size and quantity.

H2.:

RoomOccupation: There must be at most one event per room per period.

H3.:

HardConflicts: Two events must have different periods if they are in hard conflict. Two events have a hard conflict in the following cases: (i) they are part of courses that are both primary courses of one curriculum; and (ii) they have the same teacher.

H4.:

Precedences: If one event precedes another, the first one must be scheduled before the second one. Two events have a precedence constraint in the following cases: (i) they are written and oral parts of the same examination; and (ii) they are part of two consecutive examinations of the same course.

H5.:

Unavailabilities: Events cannot be assigned to an unavailable period or an unavailable room. Unavailabilities are explicitly stated for each specific event in the input data.

The objectives (soft constraints) are the following:

  • S1. SoftConflicts: Two event should have different periods if they are in soft conflict, i.e., if they belong to courses that are in the same curriculum, either as primary and secondary or both as secondary.

  • S2. Preferences: Events should not be assigned to an undesired period or an undesired room. A period can also be specified as preferred (a positive preference), so that in case of preferred periods for an event, all indifferent ones are assumed undesired (and explicitly undesired one are given a larger penalty).

  • S3. Distances: Requested period separations between events should be observed. We have two types of separations: directed, the first event must precede the second one, and undirected, such that they can be in any order. We have a directed distance constraint in the following cases:

    • Same examination: written and oral events of the same examination have a minimum and a maximum distance.

    • Same course: events belonging to consecutive examinations of the same course must be separated by a minimum distance. The separation constraint is applied between the first (or single) part of each of the two examinations.

    We have a undirected distance constraint in the following case:

    • Same curriculum: if two courses belong to the same curriculum, there should be a minimum separation between the examinations (as above, for two-part examinations, we consider the first one). The amount of separation and the weight for its violation depend on the type of the two (primary or secondary) memberships.

The weight of the violation of the various types of soft conflicts is normally set by the end-user. Similarly, all distance limits and the corresponding weights are configurable. For the sake of reproducibility, in this work we fix the weights to the values summarized in Table 1, whereas the distances are listed in the input data.

Table 1 Weights of soft constraints

3 Related work

Examination timetabling is a classical optimization problem that has been extensively studied in the scientific literature (see, e.g., Qu et al., 2009; Schaerf, 1999). Here, we discuss the formulations proposed, with particular attention to those equipped with some public datasets, which can be used to compare search methods.

The two formulations that have received most attention in the literature are the one by Carter et al. (1996) and the one proposed by McCollum et al. (2007) as part of the 2nd International Timetabling competition (ITC2007). These two formulations are rather different from each other, as the first is an extremely simple formulation (basically an extension of graph coloring), whereas the second one is more complex, including several peculiar rules of the British universities. Both formulations are complemented by very challenging datasets composed of 13 and 12 instances, respectively, none of which has been solved to proven optimality so far.

Another dataset comes from the Yeditepe University (Turkey) introduced by Özcan and Ersoy (2005) and subsequently modified by Bilgin et al. (2006), who also extended the Carter dataset by including room capacity (see Parkes and Özcan, 2010 for a discussion on Yeditepe and modified Carter datasets).

A study about real-world ETTP in the Asian context was first introduced and subsequently extended by Kahar and Kendall (2010, 2015), who presented the case of Universiti Malaysia Pahang (Malaysia). Demeester et al. (2012) addressed the ETTP at KAHO Sint-Lieven (Belgium), whereas Müller (2016) showed the application of examination timetabling to a large American university (Purdue University).

Woumans et al. (2016) proposed another interesting formulation, which addressed the problem from a student-centric perspective, considering the possibility of scheduling the same examination more than once if it improves the fairness among different student groups. The authors developed a Column Generation approach applied to a test case at the KU Leuven campus (Belgium). Muklason et al. (2017) carried out an in-depth analysis of fairness in examination timetabling.

More recent activity includes Keskin et al. (2018), Abou Kasm et al. (2019), and Güler et al. (2021) that have presented different specific formulations for real-world ETTPs.

Our formulation (Battistutta et al., 2020) consists of many peculiar constraints and objectives that have not previously been addressed together in the scientific literature. These include one or more examination of the same course, which, in turn, can combine a written and an oral part; examinations split across multiple rooms; and primary and secondary courses for each curriculum, determining different levels of conflicts between the respective examinations.

After discussing the variants of ETTPs proposed in the literature, we move on to discussing the techniques employed for their solution. Many different search methods have been applied, ranging from exact methods, heuristics, metaheuristics, and hybrid techniques.

Table 2 Notation for modeling the problem

The literature on the application of integer programming (IP) for ETTPs is relatively sparse. Some works focus on improved preprocessing and mixed integer programming (MIP) formulations of the examination timetabling problem posed in the ITC2007 (Arbaoui et al., 2015, 2019). Others investigate university-specific problems. Al-Yakoob et al. (2010) consider both examination timetabling and proctor assignment at Kuwait University, defining and solving a MIP for each problem. Cataldo et al. (2017) solve the examination timetabling problem at Diego Portales University (Chile) using IP in four stages. The interesting feature of the latter problem is that, similar to ours, it is curriculum-based, which means that conflicts are defined based on the predefined set of curricula. This is not the case for all other formulations, for which conflicts are expressed on the basis of the actual enrollment of the students to the specific examinations (called post-enrollment timetabling). Recently, Al-Hawari et al. (2020) have implemented a multistage MIP approach to solve the ETTP at German Jordanian University (Jordan), which is validated both on real-world instances and on Carter’s benchmarks.

June et al. (2019) studied the ETTP at the Universiti Malaysia Sabah Labuan International Campus (UMSLIC) and developed a hybrid solution method that integrates CP and SA in two phases: An initial feasible schedule is obtained through constraint programming; then, the quality of the solution is improved by SA. The method was tested on datasets collecting the data of two semesters at UMSLIC (Malaysia). Genc and O’Sullivan (2020) provided a CP model, mainly based on bin-packing constraints, to solve the examination timetabling problem at University College Cork (Ireland).

Metaheuristics have proved to be really effective in solving ETTPs. Current best-known results on Carter’s instances have been obtained by the SA developed by Bellio et al. (2021), the genetic algorithm of Leite et al. (2018), and the Great Deluge approach by Burke and Bykov (2016). Similarly, state-of-the-art solvers for the ITC2007 formulation have implemented variants of SA (Burke and Bykov, 2016; Battistutta et al., 2017; Leite et al., 2019).

4 Search methods

In this section, we describe the search methods developed in our study, namely mixed integer programming (Sects. 4.1 and 4.3), constraint programming (Sect. 4.2), and simulated annealing (Sect. 4.4).

Table 2 shows some common notation used in our MIP and CP models of the problem. In particular, we define the set \(\mathcal {R}\) of all rooms, which includes single, composite, and dummy rooms. On members of \(\mathcal {R}\) , we define the notion of equivalence: Two single rooms are equivalent if the have the same size (small, medium, or large); two composite rooms are equivalent if their members have the same size (composite rooms are homogeneous) and they have the same cardinality. Finally, for composite rooms, we define the notion of overlapping rooms, which are their (single) members and other composite rooms with members in common.

4.1 Mixed integer programming

In this section, we define an integer programming (IP) model for the problem. However, we can relax several of the integer variables, resulting in a mixed integer programming (MIP) model. We generally refer to the model as a MIP for simplicity, although it may be an IP. We use both model variants for computational testing and explicitly state when the model is an IP model. When comparing solution approaches, we designate this method MIP.

We introduce the MIP model in Sect. 4.1.1, and in Sect. 4.1.2, we describe a two-stage decomposition approach using the model which we call MIP \(_\textrm{2S}\).

4.1.1 The MIP model

Table 3 Decision variables of the MIP model

Table 3 shows the core decision variables of the model. We have \(x_{e, p, r}\)  as the main decision variable, determining each event’s period and room assignment. We include auxiliary variable \(y_{e, p}\)  to simplify writing some constraints and reduce the number of nonzeros of the model. Additionally, we include \(h_{e}\)  to attain the ordinal value of the period assigned to each event. We use “lazy” notation and do not include the most obvious conditions in sums. For example, since we only define \(x_{e, p, r}\)  for \(p\in \mathcal {P}_{e}\wedge r \in {\mathcal {R}}_{e}\), a sum over events for \(x_{e, p, r}\)  should be written as

$$\begin{aligned} \sum _{\begin{array}{c} e\in \mathcal {E}\\ : \ r \in {\mathcal {R}}_{e}\wedge \ p\in \mathcal {P}_{e} \end{array}} x_{e, p, r}\end{aligned}$$

but we simply write

$$\begin{aligned} \sum _{e\in \mathcal {E}} x_{e, p, r}\end{aligned}$$

Constraints (1)–(7) ensure that we satisfy all hard constraints. Constraint (1) assigns each event to an available period and room, satisfying H1 and H5. Constraints (2) and (3) enforce H2 by ensuring that at most one event can use a single room in any period and that composite rooms can only be used if none of its overlapping rooms are simultaneously used. In constraint (3), M equals the number of elements in the overlapping rooms sum. We enforce precedence relationships (H4) using (4) and avoid hard conflicts (H3) using (5), where M equals the number of elements in the sum. Constraints (6) and (7) set the values of \(y_{e, p}\)and \(h_{e}\), respectively.

$$\begin{aligned}&\sum _{p \in \mathcal {P}_e} \sum _{r \in \mathcal {R}_{\varepsilon }} x_{e, p, r} \quad =1 \quad \forall e \in \mathcal {E} \end{aligned}$$
(1)
$$\begin{aligned}&\sum _{e \in \mathcal {E}} x_{e, p, r} \quad \le 1 \quad \forall r \in \mathcal {R}, p \in \mathcal {P} \end{aligned}$$
(2)
$$\begin{aligned}&M \sum _{e \in \mathcal {E}} x_{e, p, r^c}+\sum _{r^o \in \mathcal {R}_{r^{c}}^o} \sum _{e \in \mathcal {E}} x_{e, p, r^o} \le M \quad \forall r^c \in \mathcal {R}^c, p \in \mathcal {P} \end{aligned}$$
(3)
$$\begin{aligned}&h_{e_1}-h_{e_2} \quad \le -1 \quad \forall \left( e_1, e_2\right) \in F \end{aligned}$$
(4)
$$\begin{aligned}&M \cdot y_{e, p}+\sum _{e_2 \in H C_e} y_{e_2, p} \quad \le M \quad \forall e \in \mathcal {E}, p \in \mathcal {P}_e \end{aligned}$$
(5)
$$\begin{aligned}&y_{e, p}-\sum _{r \in \mathcal {R}_e} x_{e, p, r} \quad =0 \quad \forall e \in \mathcal {E}, p \in \mathcal {P}_e \end{aligned}$$
(6)
$$\begin{aligned}&\sum _{p \in \mathcal {P}_e} O(p) \cdot y_{e, p} \quad =h_e \quad \forall e \in \mathcal {E} \end{aligned}$$
(7)

We now move to the soft constraints, starting with soft conflicts (S1), which involve primary/secondary (PS) and secondary/secondary (SS) curriculum connections. For each event \(e\in \mathcal {E}\) , we define a set of PS and SS conflicting events as \(SC_{e}^{PS}\)  and \(SC_{e}^{SS}\), respectively. We also introduce integer variables \(s_{e,p}^{PS}\)  and \(s_{e,p}^{SS}\)  to, respectively, “count” the number of PS and SS soft conflicts for event e  in period \(p\). We set the variable values using constraints (8) and (9). For these constraints, we get the values of big-M as the number of elements in the sum, e.g., \(M_{e,p}^{PS}= \left| \left\{ e_2\in SC_{e}^{PS}: O (e) < O (e_2) \wedge p\in \mathcal {P}_{e_2}\right\} \right| \). The \(O (e) < O (e_2)\) condition ensures that a conflict is only counted once. Variables \(s_{e,p}^{PS}\)  and \(s_{e,p}^{SS}\)  are bounded by 0 and \(M_{e,p}^{PS}\)  and \(M_{e,p}^{SS}\), respectively.

$$\begin{aligned}&M_{e, p}^{P S} \cdot y_{e, p}+\sum _{\begin{array}{c} e_2 \in S C_e^{P S} \\ : O(e)<O\left( e_2\right) \end{array}} y_{e_2, p} \le s_{e, p}^{P S}+M_{e, p}^{P S} \quad \forall e \in \mathcal {E}, p \in \mathcal {P}_e \end{aligned}$$
(8)
$$\begin{aligned}&M_{e, p}^{S S} \cdot y_{e, p}+\sum _{\begin{array}{c} e_2 \in S C_e^{S S} \\ : O(e)<O\left( e_2\right) \end{array}} y_{e_{2}, p} \le s_{e, p}^{S S}+M_{e, p}^{S S} \quad \forall e \in \mathcal {E}, p \in \mathcal {P}_e \end{aligned}$$
(9)

To penalize violations of soft conflicts, we add the following terms to the objective function, where \(\beta ^{PS}\)  and \(\beta ^{SS}\)  denote the soft conflict costs for PS and SS event pairs, respectively.

$$\begin{aligned} {Cost}^{{\text {S}} 1}=\beta ^{P S} \sum _{e \in \mathcal {E}} \sum _{p \in \mathcal {P}_{e}} s_{e, p}^{P S}+\beta ^{S S} \sum _{e \in \mathcal {E}} \sum _{p \in \mathcal {P}_{e}} s_{e, p}^{S S} \end{aligned}$$

The second soft constraint (S2) is event period and event room preferences. We define nonnegative costs \(\alpha _{ep}\)  and \(\alpha _{er}\)  and add the following terms to the objective function.

$$\begin{aligned} {Cost}^{\textrm{S} 2}=\sum _{e \in \mathcal {E}} \sum _{p \in \mathcal {P}_{e}} \alpha _{e p} y_{e, p}+\sum _{e \in \mathcal {E}} \sum _{p \in \mathcal {P}_e} \sum _{r \in \mathcal {R}_e} \alpha _{e r} x_{e, p, r} \end{aligned}$$

As the third soft constraint (S3), the problem includes preferred distances between event pairs. For these event pairs, we are given a minimum/maximum distance parameter \(P_{e_1,e_2}^{min}\)  and \(P_{e_1,e_2}^{max}\), respectively. First, we show how we “measure” the distance between event pairs before showing how we penalize violations. Let \(DP^{\leftarrow }\) and \(DP^{\leftrightarrow }\) be the sets of event pairs with a soft distance (min. or max.) constraint with directed and undirected distances, respectively. An event pair \(\left( e_1,e_2\right) \) has a directed distance requirement when we have a hard precedence constraint (H4) that guarantees that \(e_1\) precedes \(e_2\).

Table 4 Decision variables included for the S3 constraints

In order to include these soft constraints, we introduce the variables shown in Table 4. First, we define the integer variable \(d_{e_1,e_2}\)  to take the absolute value of the distance between \(e_1\) and \(e_2\). Notice that the maximum distance possible between any two events is always bounded by \(|\mathcal {P}|\) . When event pairs have a directed distance requirement, we set \(d_{e_1,e_2}\)  correctly using constraint (10).

$$\begin{aligned} d_{e_1, e_2}=h_{e_2}-h_{e_1} \quad \forall \left( e_1, e_2\right) \in D P^{\leftarrow } \end{aligned}$$
(10)

When event pairs have an undirected distance requirement, we need to use some additional auxiliary variables and constraints to get the absolute distance correctly. We introduce integer variable \(d_{e_1,e_2}^{\leftrightarrow }\)  to measure the distance between the two events. We set this variable using constraint (11), and the variable may then take a negative value. We need \(d_{e_1,e_2}\,\,= |d_{e_1,e_2}^{\leftrightarrow }|\), which we linearize in the following.

We introduce binary variable \(g_{e_1,e_2}^{\leftrightarrow }\)  to be 1 if \(d_{e_1,e_2}^{\leftrightarrow }\)  is positive and 0 otherwise. We set \(g_{e_1,e_2}^{\leftrightarrow }\)  using constraints (12) and (13). Then we introduce two variables \(d_{e_1,e_2}^{abs_1}\)  and \(d_{e_1,e_2}^{abs_2}\)  where exactly one takes the absolute value of \(d_{e_1,e_2}^{\leftrightarrow }\)  and the other takes the value of zero. We set the values of these two variables using (14)–(18). For simplicity in writing the model, we set the value of \(d_{e_1,e_2}\)  using constraint (19). In practice, we (of course) simply use \(d_{e_1,e_2}^{abs_1}+ d_{e_1,e_2}^{abs_2}\) in place of \(d_{e_1,e_2}\)  .

$$\begin{aligned}&d_{e_1, e_2}^{\leftrightarrow }=h_{e_2}-h_{e_1}&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(11)
$$\begin{aligned}&d_{e_1, e_2}^{\leftrightarrow } \le |\mathcal {P}| \cdot g_{e_1, e_2}^{\leftrightarrow }&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(12)
$$\begin{aligned}&d_{e_1, e_2}^{\leftrightarrow } \ge -|\mathcal {P}|\left( 1-g_{e_1, e_2}^{\leftrightarrow }\right)&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(13)
$$\begin{aligned}&d_{e_1, e_2}^{a b s_1} \le |\mathcal {P}| g_{e_1, e_2}^{\leftrightarrow }&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(14)
$$\begin{aligned}&d_{e_1, e_2}^{a b s_1} \ge -|\mathcal {P}| g_{e_1, e_2}^{\leftrightarrow }&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(15)
$$\begin{aligned}&d_{e_1, e_2}^{a b s_1} \le d_{e_1, e_2}^{\leftrightarrow }+|\mathcal {P}|\left( 1-g_{e_1, e_2}^{\leftrightarrow }\right)&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(16)
$$\begin{aligned}&d_{e_1, e_2}^{a b s_1} \ge d_{e_1, e_2}^{\leftrightarrow }-|\mathcal {P}|\left( 1-g_{e_1, e_2}^{\leftrightarrow }\right)&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(17)
$$\begin{aligned}&d_{e_1, e_2}^{a b s_2}=d_{e_1, e_2}^{a b s_1}-d_{e_1, e_2}^{\leftrightarrow }&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(18)
$$\begin{aligned}&d_{e_1, e_2}=d_{e_1, e_2}^{a b s_1}+d_{e_1, e_2}^{a b s_2}&\forall \left( e_1, e_2\right) \in D P^{\leftrightarrow } \end{aligned}$$
(19)

Table 5 shows an overview of when min. and max. distance soft constraints exist between two events and if the distance is directed. Additionally, the table shows the variable used to capture each specific distance violation, the associated cost parameter, and the event pair sets for each category. WO is shorthand for written and oral. Additionally, PP and PS indicate primary–primary and primary–secondary relations, respectively. Thus, for example, the last line of the table relates to events that are associated two courses such that one is a primary and the other a secondary course of the same curriculum. To get the minimum and maximum distance violation, we, respectively, use constraints with the following structure: \(p_{e_1,e_2} + d_{e_1,e_2}\,\,\ge P_{e_1,e_2}^{min}\), and \(d_{e_1,e_2}\,\,- P_{e_1,e_2} \le P_{e_1,e_2}^{max}\). Thus, constraints (20)–(24) correctly set the \(p_{e_1,e_2}\) variables in the order they are shown in Table 5.

Table 5 Overview of soft constraints related to minimum and maximum distance (S3)
$$\begin{aligned}&p_{e_1, e_2}^{\min E}+d_{e_1, e_2} \ge P_{e_1, e_2}^{\min } \quad \forall \left( e_1, e_2\right) \in D P^E \end{aligned}$$
(20)
$$\begin{aligned}&p_{e_1, e_2}^{\min W O}+d_{e_1, e_2} \ge P_{e_1, e_2}^{\min } \quad \forall \left( e_1, e_2\right) \in D P^{W O} \end{aligned}$$
(21)
$$\begin{aligned}&d_{e_1, e_2}-p_{e_1, e_2}^{\max W O} \le P_{e_1, e_2}^{\max } \quad \forall \left( e_1, e_2\right) \in D P^{W O} \end{aligned}$$
(22)
$$\begin{aligned}&p_{e_1, e_2}^{\min PP}+d_{e_1, e_2} \ge P_{e_1, e_2}^{\min } \quad \forall \left( e_1, e_2\right) \in D P^{P P} \end{aligned}$$
(23)
$$\begin{aligned}&p_{e_1, e_2}^{\min P S}+d_{e_1, e_2} \ge P_{e_1, e_2}^{\min } \quad \forall \left( e_1, e_2\right) \in D P^{P S} \end{aligned}$$
(24)

We then extend the objective function with

$$\begin{aligned} \textrm{Cost}^{\textrm{S} 3}= & {} \gamma ^E \sum _{\left( e_1, e_2\right) \in D P^E} p_{e_1, e_2}^{\min E}\\{} & {} +\,\gamma ^{W O} \sum _{\left( e_1, e_2\right) \in D P^{W O}}\left( p_{e_1, e_2}^{{\text {minWO}}}+p_{e_1, e_2}^{{\text {maxWO}}}\right) \\{} & {} +\,\gamma ^{P P} \sum _{\left( e_1, e_2\right) \in D P^{P P}} p_{e_1, e_2}^{\min P P}\\{} & {} +\,\gamma ^{P S} \sum _{\left( e_1, e_2\right) \in D P^{P S}} p_{e_1, e_2}^{\min P S} \end{aligned}$$

Finally, we note that we can relax the soft constraint penalty counting variables (\(s_{e,p}^{PS}\), \(s_{e,p}^{SS}\), \(p_{e_1,e_2}^{minE}\), \(p_{e_1,e_2}^{minWO}\), \(p_{e_1,e_2}^{maxWO}\), \(p_{e_1,e_2}^{minPP}\), and \(p_{e_1,e_2}^{minPS}\)) to be continuous, as they naturally will attain integer values using the given constraints. Thus, we can express the model either as an IP or a MIP.

4.1.2 Two-stage decomposition

A typical approach for timetabling problems, especially for IP-based methods, is constructing the timetable in two stages (see, e.g., Daskalaki and Birbas, 2005; Kristiansen et al., 2015; Al-Yakoob and Sherali, 2015). One successful decomposition strategy involves first assigning events to periods and afterward to rooms (Lach & Lübbecke, 2008, 2012; Sørensen & Dahms, 2014). For this decomposition scheme to be effective, we must ensure that the solution found in the first stage is feasible or that we can easily fix any infeasibility in the second stage. In this problem, the only infeasibility that could arise between the two stages is not observing the room double-booking constraints (H2). However, rooms are generally in excess for the available data, especially since events can use rooms larger than they requested. Additionally, we have no event room forbidding (hard) constraints in the current data, and thus, with regard to feasibility, events are indifferent to specific rooms, and rooms can be treated as generic rooms of a specific size.

Here we discuss how we implement such a decomposition for this problem, which we designated MIP \(_\textrm{2S}\). In the first stage, \(y_{e, p}\)becomes the primary assignment variable, and we leave out all room-related aspects, e.g., the \(x_{e, p, r}\)variables, room double-booking constraints, and room preference objectives. We add these parts to the model in the second stage. We use the first stage solution as a warm start and do not fix any assignments, allowing the MIP solver to change period assignments freely, and since the second stage model is the full model, we gain valid lower bounds. In the following, we define constraints to significantly reduce the risk of overbooking a period in the first stage regarding room capacity in the second stage.

Let \(\mathcal {T}\)  be the set of single room sizes, i.e., \(\mathcal {T}= \{S, M, L\}\) where \(S\), \(M\), \(L\)  denote small, medium, and large, respectively. Composite rooms consist of multiple rooms of the same size, and we can therefore consider the combination of room size and number of rooms, which we call a “room-type.” Let \({\mathcal {N}_{t}}\)  be a set of “number of rooms” for a given room size \(t\). For example, we have \({\mathcal {N}}_{S}= \{1,2,{3}\}\) if the instance data includes single small rooms and composite rooms consisting of two, and three small rooms. Then \((n, t) = (2, S)\) denotes the room-type consisting of two small rooms.

Fig. 1
figure 1

Composite room conflict graph for the given example

Let \(\mathcal {E}_{n,t}^{RT}\)  be the set of events that require a room of room-type (nt) and \(c_{n,t,p}\)  the capacity of room-type (nt) in period \(p\). For single rooms (\(n= 1\)), we set the capacity to the number of single rooms of size \(t\), or larger, available in period \(p\). For composite rooms, we need to handle composite room member conflicts. We consider an example with five single small rooms \(\left( \left\{ A, B, C, D, E\right\} \right) \) and three (2, S) \(\left( \left\{ AB, BC, CD\right\} \right) \) and two (3, S) \(\left( \left\{ ABC, CDE\right\} \right) \) composite rooms. Figure 1 shows the composite room conflict graphs for the example. By inspection, we see that at most two (2, S) and one (3, S) can be used at a time. These limits correspond to each graph’s independence number. However, due to room unavailabilities, the graph’s nodes may depend on the period \(p\). Thus, we get the capacity \(\left( c_{n,t,p}\right) \) for composite room-types as the independence number of the conflict graph in period \(p\).

Examinations requesting a single room can use rooms of that size or larger, e.g., an examination requesting a small room can use small, medium, and large rooms. Examinations requesting a composite room can only use composite rooms that meet their exact specification. Constraint (25) ensures that we observe single room occupation and (26) specifically limits the number of events assigned in a single period that require composite rooms.

$$\begin{aligned}&\sum _{\begin{array}{c} t^{\prime }\in \mathcal {T} \\ :t^{\prime } \ge t \end{array}} \sum _{n \in \mathcal {N}_t} n \sum _{e \in \mathcal {E}_{n, t}^{R T}} y_{e, p} \le \sum _{\begin{array}{c} t^{\prime } \in \mathcal {T} \\ :t^{\prime } \ge t \end{array}} C_{t, 1, p} \quad \forall t \in \mathcal {T}, p \in \mathcal {P} \end{aligned}$$
(25)
$$\begin{aligned}&\sum _{e \in \mathcal {E}_{n, t}^{R T}} y_{e, p} \quad \le C_{n, t, p} \quad \forall t \in \mathcal {T}, n \in \mathcal {N}_t: n>1, p \in \mathcal {P} \end{aligned}$$
(26)

These constraints do not consider composite room conflicts for composite rooms with a different number of member rooms. However, in most cases, they are sufficient as few instances have composite rooms with three and four member rooms, and fewer still with the possibility of conflicts between them. We have not experienced a solution from the first stage being infeasible in the second stage throughout our testing. By far, it is most important to forbid double-booking between single and composite rooms with two member rooms, as they are most common. Constraints (25) and (26) handle this fully.

4.2 Constraint programming

In this section, we define CP, a constraint programming (CP) model for the problem, encoded in the MiniZinc modeling language (see Nethercote et al., 2007) and solved with the Gecode backend (see Gecode Team, 2019).

Although CP is an exact method, we actually use a heuristic search method, because exhaustive search does not work well for this problem. The heuristic search uses restarts and large neighborhood search (LNS, see Shaw, 1998, Dekker et al., 2018) on top of a branch-and-bound scheme. This is expressed by search annotations, which are passed to Gecode for execution. It is worth noting that this is supported by MiniZinc straight out of the box.

As noted earlier, events are indifferent to specific rooms, and rooms can be treated as generic rooms of a specific size. We thus split \({\mathcal {R}\,}\) into a set \({\mathcal {K}}\) of equivalence classes. Then, we solve the problem in terms of equivalence classes instead of specific rooms, relaxing constraint H2 to allow multiple events per class simultaneously up to the capacity of the class. Finally, we trivially transform the obtained solution back into one expressed in terms of specific rooms. We now present the MiniZinc model step by step.

The parameters Events, Periods, Rooms, and CRooms correspond to \({|\mathcal {E}|}\), \({|\mathcal {P}|}\), \({|\mathcal {R}\,|}\), and \({|\mathcal {K}|}\), respectively. The decision variables are as follows, where \(\texttt {CRoomPeriod}\)\(\texttt {Index}(k,p)\) is a bijective function of the room class k and period p. Also, \(\texttt {CRoomPeriodIndex}(k,p)\) is monotonically increasing in the absolute distance between p and the middle period. In other words, the farther away p is from the middle period, the larger the function value is. \(\texttt {EventCRP}[e]\) is an auxiliary variable for the (period, room class) combination of event e:

figure a

The first constraints encode constraints H1 and H5:

figure b

By eliminating up front infeasible values from the domain of EventCRP, the next constraint contributes to encoding constraint H5:

figure c

The next constraint encodes constraint H2:

  • The quantity \(\texttt {count}[k,p]\) counts the number of events for given equivalence class k and period p by means of a global_cardinality_closed constraint (see Stuckey et al., 2020), and its value can be at most the cardinality of given equivalence class k.

  • A composite room cannot be used simultaneously with an overlapping room.

figure d

The next constraint encodes the hard conflicts rule (constraint H3) using an alldifferent constraint (see Stuckey et al., 2020), where each \(c \in \texttt {AllCliques}\) is a maximal clique of events that are in hard conflict with each other. The cliques are computed by the Python function networkx.find_cliques, which is based on the algorithm published by Bron and Kerbosch (1973) in a preprocessing step, which runs in negligible time compared to the solving time.

figure e

The next constraint enforces precedences (constraint H4):

figure f

The next constraint defines the ConflictDistanceCost term of the objective function (constraints S1 and S3). We have transformed the raw input data, which was given as several large arrays and several cost terms for many pairs of events, into two arrays CostKeys and CostVectors that aggregate these data into at most one cost term per event pair. This transformation was done in the same spirit as tabling (see Dekker et al., 2017). Let i be a row of CostKeys containing a pair \((e_1,e_2)\) of events and let \(\delta \) be their temporal distance. Then \(\texttt {CostVectors}[i,\delta ]\) is the incurred cost for that pair:

figure g

The last constraints define the remaining terms of the objective function (constraint S2):

figure h

Finally, the search and minimization statement is as follows, where SortedEvents is the sequence of events in some heuristic order, and the four \(*\)Cost expressions are the terms of the objective function:

figure i

The search annotations are:

  • restart_luby(100)—the kth restart is given a node limit of \(100 \cdot L_k\), where L is the Luby restart sequence (see Luby et al., 1993).

  • relax_and_reconstruct(CRPE, 97)—at every restart after finding the first solution, an LNS step is performed where 3% randomly selected decision variables are set free and the remaining 97% keep their current values.

  • int_search(CRPE, dom_w_deg, indomain_split)—this annotation determines the order in which variables and their values are explored. The next variable to explore is the EventCRP variable with the smallest current domain size divided by weighted degree, breaking ties by the SortedEvents order. For the designated variable, explore the lower half of its domain first, splitting domains until one value has been singled out. By the monotonicity property of CRoomPeriodIndex, this has the effect of attempting to place events close to the middle period before attempting to place them farther away from the middle.

The choice of Luby as opposed to other restart schemes and of the parameter values was made after preliminary experimentation.

4.3 Mixed integer programming with MiniZinc

In this section, we define MIP \(^\textrm{MZ}\), another MIP model for the problem. It is solved with the Gurobi backend (see Gurobi Optimization, LLC, 2021). MiniZinc has rich support for binarizing multiple-value domains and linearizing constraints to make models suitable for execution by MIP solvers. We have not relied much on that support; instead, our model uses 0–1 variables and linear constraints almost exclusively. In fact, MIP \(^\textrm{MZ}\) was derived from the CP model by manual binarization and linearization.

The key difference between CP and MIP \(^\textrm{MZ}\) are in the decision variables, which are all 0–1, as follows. EventPeriodCRoom is the main decision variable, which relates events, periods, and room classes. The CP counterparts are EventPeriod and EventCRoom.

figure j

Another decision variable CostDistance facilitates the part of the objective function that depends on the temporal distance between events. Let i be a row of CostKeys containing a pair \((e_1,e_2)\) of events and let \(\delta \) be their temporal distance. Then \(\texttt {CostDistance}[i,d] = 1\) if \(d = \delta \) and 0 otherwise:

figure k

The MIP \(^\textrm{MZ}\) constraints consist of a straightforward linearization and adaptation of the CP constraints to the MIP \(^\textrm{MZ}\) variables.

4.4 Simulated annealing

The simulated annealing (SA) approach is an extension of the one proposed by Battistutta et al. (2020). In detail, to represent a state in the search space we use two vectors that store the period and the room of each event, respectively. Only periods in \(\mathcal {P}_{e}\) and rooms in \(\mathcal {R}_{e}\) can be assigned to event e, thus explicitly enforcing constraints H1 and H5. The other constraints, namely H2 (RoomOccupation), H3 (HardConflicts), and H4 (Precedences), can be violated and are included in the cost function with a high weight. The cost function is thus a linear combination of the soft constraints S1S3 and a measure of the distance to feasibility, corresponding to the degree of violation of constraints H2, H3, and H4.

The initial solution is generated totally at random, except that it always satisfies constraints H1 and H5. This is obtained simply by drawing randomly period and room assignments for event e from \(\mathcal {P}_{e}\) and \(\mathcal {R}_{e}\), respectively.

Regarding the neighborhood relation, Battistutta et al. (2020) employed the one, called MEE (MoveEventOrExam), that moves either a single event (with probability \(1-p_b\)) or the two events associated with a composite examination jointly (with probability \(p_b\)).

We also use MEE, but in combination with a new neighborhood, called SE (SwapEvents), which swaps period and room of two single events. Only events that do not belong to a composite examination are included in the SE neighborhood.

At each iteration a random move is drawn from the neighborhood MEE \(\cup \) SE. The move selection is biased on the basis of a parameter \(p_s\) (called swap rate), so that a SE move is selected with probability \(p_s\), and a MEE move with probability \(1-p_s\). The parameters \(p_b\) and \(p_s\) are fixed experimentally, according to the tuning procedure. The drawing of the specific move inside the selected neighborhood is made according to a uniform distribution.

Table 6 Instance features

As customary for SA, we use the Metropolis acceptance criterion: A move is always accepted if it is improving or sideways (i.e., same cost), whereas it is accepted based on a time-decreasing exponential distribution \(e^{-\varDelta /T}\) in case it is worsening, where \(\varDelta \) is the difference of total cost induced by the move, and T is the temperature.

The temperature starts at the initial value \(T_0\) and is evolves according to the standard geometric cooling scheme of SA, with the cutoff mechanism. That is, it is decreased during the search by multiplying it by a value \(\alpha \) (with \(0< \alpha < 1\)) after a fixed number of samples \(N_s\) have been drawn, or a fixed number of moves \(N_a\) have been accepted.

For the tuning procedure, we decided to use as stop criterion the total number of iterations \(\mathcal{I}\), in order to keep the running time approximately equal for all configurations of the parameters. In our experiments, we fixed \(\mathcal{I} = 10^8\), corresponding to an average time of approximately 660 s per run.

The tuning procedure has been performed using the tool json2run (Urli, 2013), which samples the configurations using the Hammersley point set (Hammersley & Handscomb, 1964) and implements the F-Race procedure (Birattari et al., 2010) for comparing them.

The winning configuration turned out to be: \(T_0 = 188.89\), \(\alpha = 0.875\), \(N_s = 2092772\), \(N_a = 292988\), \(p_b = 0.967\), and \(p_s = 0.288\).

For the experiments in the comparison with the other techniques, in order to have the same fixed running time on all instances, we use an additional stop criterion based on total running time.

5 Experimental analysis

We first introduce the dataset employed in the analysis (Sect. 5.1), then we show the settings used for the comparison (Sect. 5.2), and finally, we report and discuss the experimental results.

5.1 Problem instances

The dataset is composed of real-world instances extracted from various Italian universities and collected by Battistutta et al. (2020). They are written in JSON file format but are also made available in dzn format, thanks to a preprocessing procedure that splits courses into single events and distributes constraints accordingly.

In reference to the extraction and preprocessing procedures, we have corrected a few minor issues in the code of Battistutta et al. (2020) so that the instances that we use here are actually slightly revised with respect to the ones used by Battistutta et al. (2020).

The repository https://bitbucket.org/satt/examtimetablinguniuddata contains both the original instances and the revised ones. In addition, the it also contains a software toolbox, written in Python, that allows us to check the validity of instances and solutions. The software has been written independently from the solvers so to be used as a sort of third-party solution checker in the spirit of the methodology outlined in (Bonutti et al., 2012). Besides allowing the debugging of the different solution approaches presented in this paper, this toolbox will provide against misinterpretations of the different constraints in case of future works on the same problem by different researchers, thus enabling the comparability of results. In addition, the software provides a format translator between the original (and richer) JSON format to the event-based dzn format and a tool for computing a set of features from both the instances and the solutions.

Related to this last functionality, Table 6 shows the main features of the revised instances in terms of the total number of courses, events, periods, single rooms and composite rooms, and the number of timeslots (i.e., periods in a day). The name of each instance follows this pattern: Dx-y-z, where x is the department identifier, y is the examination season, and z is the academic year. It can be noticed that instances of the same department are quite homogeneous, except for D3, D4, and D5, where the number of events and periods changes during the year depending on the season.

5.2 Setting and tuning

We run all experimental tests in a high-performance computing cluster on 64bit computers running Scientific Linux 7.7. We use computers equipped with 756GB RAM and two Intel Xeon Gold 6226R CPUs clocked at 2.90GHz. We run the MiniZinc model using MiniZinc version 2.5.5 (see Nethercote et al., 2007) and Gecode version 6.3.0 (see Gecode Team, 2019). To solve MIP models, we use Gurobi 9.1.0 (see Gurobi Optimization, LLC, 2021). The simulated annealing procedure has been implemented in C++ and compiled using GNU g++ (v. 9.2.0).

For all tests, except for lower bounds (LB), we have ten runs using a single thread for an hour. To get lower bounds, we have five runs using four threads and a runtime of 24 h. Table 7 shows the parameter settings used for each test. The Gurobi Presolve parameter controls the level of presolve, and a value of 2 indicates “aggressive” and 0 turns it off. The MIPFocus modifies Gurobi’s high-level search strategy, such that a value of 1 instructs Gurobi to focus on finding solutions and 3 to focus on improving the bound.

Through testing we found that when running MIP (the full model discussed in Sect. 4.1), we get better results when using the MIP variant of the model as opposed to the IP variant. The MIP \(_\textrm{2S}\) gets better results when using the IP. Additionally, testing revealed that the best time distribution for MIP \(_\textrm{2S}\) was to run the first stage for 99% of the available time, leaving just 36 s for the second stage. It is generally easy for the solver to add rooms in the second stage warm start without incurring in any room preference penalties. Thus time is better spent in the first stage. We skip presolving as there is no benefit given such a short time limit, and it is better to try and eliminate any introduced room preference penalties. Finally, we use the IP model variant for getting lower bounds, as this model generally finds better bounds.

When running MIP \(^\textrm{MZ}\), we leave the presolve parameter to the default value, as an aggressive presolve strategy on this model leads to a very long presolving phase on most instances, resulting in worse performance.

For SA, we used the winning configuration of parameters shown in Sect. 4.4, except for \(N_s\) and \(N_a\), which have been increased in order to have one hour running time.

Table 7 Parameters used for testing

5.3 Comparative results

Table 8 reports the average results of ten runs of each technique with a timeout of one hour. The second column represents the results of the code by Battistutta et al. (2020) (with some minor corrections), which we rerun on this test computer with the same timeout. The column Best reports the best-known results, obtained using all performed experiments.

First, we notice that SA improves upon the previous version on all instances but two, although the gap is relatively small. The results of CP are somewhat inferior, with some cases in which they are particularly poor. The situation is more extreme for the other MiniZinc-based model MIP \(^\textrm{MZ}\), which has some good results, but in other cases, they are extremely bad or even with no solution returned within the given timeout. A more stable behavior could perhaps be obtained by using a lazy clause generation solver, which Gecode is not (see Ohrimenko et al., 2009), but unfortunately, at this time, none is available that also has support for LNS.

Table 8 Comparative results and lower bounds

Regarding the MIP and MIP \(_\textrm{2S}\) methods, unsurprisingly, the two-stage method MIP \(_\textrm{2S}\) has (generally) better performance than the single-stage one MIP, as, like CP, it makes use of a preprocessing step that removes one dimension of the problem (i.e., the rooms), which has little effect on solution value. The fact that MIP \(_\textrm{2S}\) works better than MIP is because, in our instances, rooms are not a critical resource. Indeed, in most current cases, room endowment is sized for the course lectures, which normally require more space than the examinations, so it turned out oversized for the examinations. Nonetheless we might expect future cases in which rooms are more critical.

Finally, we notice that MIP \(^\textrm{MZ}\) and MIP \(_\textrm{2S}\) provide more robust results than SA on two instances, namely D4-1-17 and D6-3-17, consistently obtaining the best values (276 and 30, respectively).

Regarding the lower bounds, we see that, besides the cases in which there is a perfect solution (cost 0), they are quite tight (below 20%) only in four cases. In particular, in one case the lower bound is equal to the best solution, thus proving its optimality.

6 Conclusions and future work

We have investigated three independent optimization paradigms, namely CP, MIP, and SA, for the real-world examination timetabling problem proposed by Battistutta et al. (2020).

For MIP, we have developed three different versions (MIP \(^\textrm{MZ}\), MIP, and MIP \(_\textrm{2S}\)), thus resulting in five total alternative search methods that we compared among themselves and with the original SA of Battistutta et al. (2020). Even though the techniques have been developed independently (and by different authors), they have been run on the same machine with the same timeout, to have a fair competition ground. In addition, we also computed some valid lower bounds.

Unsurprisingly, the metaheuristic approach, properly tuned, turned out to work better on the majority of instances, but the exact techniques are close and actually better on a few instances.

All instances and solutions are available for inspection and future comparison, along with the solution checker that reports the value of the objective function.

In the future, we will investigate the correlation of the results of the different search methods with the features of the specific instances. To this aim, we plan to collect new real-world instances and possibly develop a principled instance generator to study such correlation on a potentially unlimited number of instances.

In addition, we plan to develop hybrid techniques that could benefit from the respective advantages of the different search methods. These hybrid techniques would range from simple ones using SA for warm-starting MIP and CP, to more complex interaction mechanisms such as Benders (1962) decomposition or the matheuristic paradigm (see Maniezzo et al., 2021).