1 Introduction

The Consultation Timetabling Problem (CTP) is a recurrent planning problem for the high schools in Denmark, which concerns the creation of a schedule for student-teacher meetings, given the students requests of teachers, subject to various soft constraints and resource constraints. The problem has not been described in the literature before, but it shares some properties with other problems within the educational sector, see Sect. 2.3. There exists several variants of the problem. In this paper we consider the two most important versions for the Danish high schools, namely the Parental Consultation Timetabling Problem (PCTP) and the Supervisor Consultation Timetabling Problem (SCTP).

This paper is written in cooperation with the Danish company MaCom A/S. MaCom A/S is the developer of the cloud-based high school administration system Lectio, which handles all sorts of administrative tasks for the high schools, including a GUI and a heuristic-based solver for the PCTP. Through this cooperation we have access to real-life data for approximately 95 % of all Danish high schools, which constitutes thousands of datasets. We will provide computational results for 300 of these datasets, which is a very big amount of real-life data compared to the majority of scheduling literature.

Our task of this paper is to give a detailed description of the CTP, and model it using Integer Programming. This model should support both the PCTP and the SCTP. To find solutions, both a state-of-the-art MIP solver and an Adaptive Large Neighborhood Search (ALNS) heuristic is attempted. These solution approaches are compared to the existing heuristic used in Lectio, and the best approach is made available for all users of Lectio.

2 Consultation timetabling problem

In the following we describe the CTP in details, starting with specifications of the two versions of the problem.

2.1 Parental consultation timetabling problem

Once or twice a year the high schools invite the students and their parents to participate in meetings with the teachers of the student. The goal of these meetings is to allow the teachers to inform of the educational progress of the student, and possibly address relevant problems. Parental consultations usually take place in the evening of a normal work day, and each meeting generally has a duration between 5 and 15 min. The order of events for parental consultations is the following: The high school administration selects days where the meetings should take place, and for each day a feasible time-interval is chosen. Each student (usually in collaboration with his parents) makes prioritized requests of groups of teachers he would like to meet. Few of these teacher groups contain more than a single teacher, because the student is taught by only one teacher in most classes. Usually the high school also allows the students to request specific time intervals, within the overall time interval on each day, where the student will be available for meeting teachers. Given the student’s choice of teachers and time intervals, it is then up to the high school administration to decide which teachers a student should meet, and when the meetings should take place.

2.2 Supervisor consultation timetabling problem

In the last year of a high school education, the students are required to make a large study project (Danish: Studieretnings Projekt). Each student selects two subjects/courses as combined subject for his project, e.g. English and History. Each student is then assigned two teachers whom will be his supervisors for the project. The objective of the SCTP is to plan meetings between the students and their respective supervisors. The goal of these meetings is for the supervisors to provide the student with some useful hints for problem definition, literature research, etc. Typically supervisor consultations take place in the daytime, where both the student and the corresponding teachers are located at the high school.

From a timetabling point of view, these two types of consultations are almost identical. For both types, as many as possible of the meeting requests should be fulfilled, and both essentially contain the same constraints. Therefore we will in this paper model both types of consultations using the same Integer Programming model. In the remainder of this paper we refrain, as much as possible, from distinguishing between the two variants of the problem, and will by CTP denote the problem which is the superset of the PCTP and the SCTP.

In the following further details of the soft constraints of the CTP is given. These soft constraints define various scheduling preferences for the students and teachers.

A contiguous streak of meetings for a teacher or student are from now on denoted a sequence. A time slot is void for a teacher or student if the time slot is empty and no meetings are scheduled in either earlier time slots or in later time slots. A break for a student or teacher is defined as a time slot which is not void, and which has no meetings assigned. Void time slots must be distinguished from breaks because they do not effect the density of a schedule. This is due to the fact that students and teachers are not obligated to stay at the school throughout the entire duration of the consultation. Given these definitions, we formulate the following soft constraints:

  • It is attempted to achieve a solution where the positions of the granted meetings for a given individual are placed in a sequence. I.e. for both students and teachers the number of breaks should be minimized. This is to achieve a schedule with as little waiting time as possible. However, for the students it is possible for the high school administration to declare whether they need a break after each meeting. This is usually selected if there exists ”traveling” time between the meeting rooms where the teachers are located. The CTP only takes consultation meetings into consideration when determining a sequence, and not other activities.

  • When assigning a meeting to a time slot, the availability of the student and teacher must be taken into consideration. The high school administration decides whether this constraint should be defined as a hard- or a soft-constraint. It is common that in case of SCTP, this is defined as a soft-constraint as it is feasible for the students to leave classes to have meetings with their supervisors. In case of the PCTP, this constraint is usually treated as a hard-constraint, as a solution should respect activities such as meetings, study-trips, etc.

  • It is undesirable for teachers to have too long sequences of meetings. Therefore a maximum on the length of sequences for teachers is given (treated as a soft-constraint). This is not required for the students, since they typically have a low number of requests.

  • For a consultation which spans over several days it is desired that a student or a teacher only have meetings in one of the days, such that they are not obligated to attend both. This is especially critical for students, as they have a low number of requests.

  • The high school administration prefers if the meetings are placed as close as possible to a specific time slot on each day. This is usually selected as the first timeslot on each day.

Figure 1 shows an example of a consultation schedule on one day. The schedule contains one void time slot, two breaks, and seven consultations.

Fig. 1
figure 1

Example of a feasible consultation schedule

2.3 Literature review

The CTP has, to the best of our knowledge, not been described in the literature before. However there exist many types of related timetabling problems within the educational system, see Schaerf (1999), Burke and Petrovic (2002), McCollum (2006), and Pillay (2010) for overviews of this field. Problems such as Course Timetabling and Student Sectioning have been looked into, e.g. Tripathy (1984), Erben and Keppler (1996), and Carter and Laporte (1998), Müller and Murray (2010). Related problems for Danish high schools include Kristiansen et al. (2011) and Kristiansen and Stidsen (2012) regarding the Elective Course Planning Problem, and Sørensen and Stidsen (2012) regarding the timetabling problem. For all these problems it applies that they attempt to assign requests to time slots in a given schedule.

The requirement for compact schedules is well known in educational timetabling. In Santos et al. (2010) the Class-Teacher Timetabling Problem with Compactness Constraints is described. The compactness is defined in terms of teacher “holes”, which is equivalent to our definition of breaks. The teacher holes are modeled with a linear IP, which entails the need for two auxiliary variables, and three additional constraints. This approach can be directly applied to the CTP. de Haan et al. (2007) specifies that the timetabling problem at Dutch high schools requires compact schedules for both classes and teachers. This is addressed using heuristic methods. For Greek high schools the situation is similar, see e.g. Birbas et al. (2009). Here the problem is handled using a MIP solver for a complicated IP.

3 Integer programming model

The following IP model for the CTP aims at maximizing the number of granted meeting requests and minimizing the violating of soft constraints, while respecting the hard constraints.

A consultation problem at a high school contains set of students \(S\), a set of teachers \(T\), a set of teacher groups \(G\), an ordered set of time slots \(B\) and a set of days \(D\). \(V_{b,d} \in \{0,1\}\) takes value 1 if time slot \(b\) is part of day \(d\), and zero otherwise.

The decision whether a student \(s\) is given a meeting with teacher group \(g\) in time slot \(b\) is defined by the binary variable \(x_{s,g,b} \in \{0,1\}\). The profit of meeting \((s,g)\) in timeslot \(b\) is given by \(\alpha _{s,g,b} \in \mathbb R ^+\). The basic objective function is therefore

$$\begin{aligned} \max \ \sum _{s,g,b} \alpha _{s,g,b} x_{s,g,b} \end{aligned}$$
(1)

3.1 Unavailabilities

In some situations it can be allowed to interrupt other activities at the high schools to satisfy a meeting request. Let \(D_{t,b} \in \{0,1\}\) take value 1 if teacher \(t\) is not available (i.e. occupied by other activities) in time slot \(b\), and zero otherwise. Let \(E_{s,b} \in \{0,1\}\) be the completely analogous parameter for the students. If a consultation meeting is placed in a time slot where either a student or a teacher has some other activities, it is penalized by the following.

$$\begin{aligned} -\sum _{s,g,b} \left( \sum _t \delta _t \cdot D_{t,b} \cdot P_{g,t} + \delta _s \cdot E_{s,b}\right) x_{s,g,b} \end{aligned}$$
(2)

If it is not allowed to interrupt activities this term is not added to the objective, and these constraints take the form of hard-constraints by forbidding meetings of teacher \(t\) in time slot \(b\) if \(D_{t,b}=1\), and likewise for student \(s\) in time slot \(b\) if \(E_{s,b}=1\).

It is of course not allowed to assign a student to a consultation meeting with a teacher group, if the student has not requested this teacher group. And is it not allowed for the student or teacher to have more than one meeting in each time slot. This imposed the following constraints. The parameter \(P_{t,g} \in \{0,1\}\) takes value 1 if teacher \(t\) is in teacher group \(g\), and zero otherwise. \(R_{s,g} \in \{0,1\}\) takes value 1 if student \(s\) has requested teacher group \(g\), and zero otherwise. \(C_{s,b} \in \{0, 1\}\) takes value 1 if student \(s\) has requested time slot \(b\).

$$\begin{aligned}&\sum _b x_{s,g,b} \le R_{s,g} \qquad \forall \, s,g\end{aligned}$$
(3)
$$\begin{aligned}&\sum _g x_{s,g,b} \le C_{s,b} \qquad \forall \, s,b \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{s,g} P_{g,t}\cdot x_{s,g,b} \le 1 \qquad \forall \, t,b \end{aligned}$$
(5)

3.2 Undesirable breaks

One of the undesirable properties of the CTP is the breaks for both students and teachers. Let the variables \(z_{s,d}\in \mathbb N \) and \(w_{t,d}\in \mathbb N \) be the number of breaks in day \(d\) for a student and for a teacher, respectively. As we do not penalize void time slots as shown in Fig. 1, we need to know when a individual have his first and last meeting. Let variables \(f^{\text{ first }}_{s,d}\) and \(f^{\text{ last }}_{s,d}\) denote the timeslot of the first and last meeting for student \(s\), respectively. Let \(h^{\text{ first }}_{t,d}\) and \(h^{\text{ last }}_{t,d}\) be the analogous variables for teacher \(t\). Let variable \(n_{s,b,d}\in \{0,1\}\) take value 1 if student \(s\) is placed in time slot \(b\) on day \(d\). The idle time slots for a student is then given by the following constraints

$$\begin{aligned}&\sum _{g} V_{b,d}\cdot x_{s,g,b} = n_{s,b,d}\qquad \forall \, s,b, d, V_{b,d} = 1 \end{aligned}$$
(6)
$$\begin{aligned}&f^{\text{ last }}_{s,d}- f^{\text{ first }}_{s,d}+ 1 - \sum _bn_{s,b,d}\cdot (1 + HS) + HS = z_{s,d}\qquad \forall \, s,d \end{aligned}$$
(7)
$$\begin{aligned}&|B|_d - (|B|_d - \text{ ord }(b))\cdot n_{s,b,d}\ge f^{\text{ first }}_{s,d}\qquad \forall \, s,b,d, V_{b,d} = 1 \end{aligned}$$
(8)
$$\begin{aligned}&\text{ ord }(b)\cdot n_{s,b,d}\le f^{\text{ last }}_{s,d}\qquad \forall \, s,b,d, V_{b,d} = 1 \end{aligned}$$
(9)

The parameter \(HS \in \{0,1\}\) indicates whether a student is required a break after each meeting, zero otherwise. This parameter is selectable for the user of Lectio. We want to penalize the cost such that it increases exponential on the number of breaks. The cost function is modeled as a piece-wise linear function by introducing a new variable \(vs_{s,d,j}\in \{0,1 \}\), where \(j \in 1, \ldots , m\), which takes value 1 if student \(s\) has \(j\) breaks in day \(d\). This imposes the following constraints.

$$\begin{aligned}&\sum _j vs_{s,d,j}\cdot \text{ ord }(j) = z_{s,d}\qquad \forall \, s,d \end{aligned}$$
(10)
$$\begin{aligned}&\sum _j vs_{s,d,j}= 1 \qquad \forall \, s,d \end{aligned}$$
(11)

As for the teacher let variable \(p_{t,b,d}\in \{0,1\}\) take value 1 if teacher \(t\) has a meeting in time slot \(b\) on day \(d\). The cost for teacher breaks is also made as a piece-wise linear function, using the new variable \(vt_{t,d,j}\in \{0,1 \}.\) The following constraints are imposed to denote the number of undesirable breaks for a given teacher,

$$\begin{aligned}&\sum _{g,s} V_{b,d}\cdot P_{g,t}\cdot x_{s,g,b} = p_{t,b,d}\qquad \forall \, t,b,d,V_{b,d} = 1 \end{aligned}$$
(12)
$$\begin{aligned}&|B|_d - (|B|_d - \text{ ord }(b))\cdot p_{t,b,d}\ge h^{\text{ first }}_{t,d}\qquad \forall \, t,b,d, V_{b,d} = 1 \end{aligned}$$
(13)
$$\begin{aligned}&\text{ ord }(b)\cdot p_{t,b,d}\le h^{\text{ last }}_{t,d}\qquad \forall \, t,b,d, V_{b,d} = 1 \end{aligned}$$
(14)
$$\begin{aligned}&h^{\text{ last }}_{t,d}- h^{\text{ first }}_{t,d}+ 1 - \sum _bp_{t,b,d}= w_{t,d}\qquad \forall \, t,d \end{aligned}$$
(15)
$$\begin{aligned}&\sum _j vt_{t,d,j}\cdot \text{ ord }(j) = w_{t,d}\qquad \forall \, t,d \end{aligned}$$
(16)
$$\begin{aligned}&\sum _j vt_{t,d,j}= 1 \qquad \forall \, t,d \end{aligned}$$
(17)

The contribution to the objective is as follows

$$\begin{aligned}&-\sum _{s,d,j} \gamma _{s,j} \cdot vs_{s,d,j}-\sum _{t,d,j} \beta _{t,j} \cdot vt_{t,d,j}\end{aligned}$$
(18)

3.3 Sequences

In connection with the undesirable breaks, the CTP also contains some needed breaks. For the students it is often necessary to give them a break between each consultation meeting due to traveling time between meeting rooms. This impose the following constraints.

$$\begin{aligned}&\sum _g \left( x_{s,g,b} + x_{g,b_{+1},s}\right) \le 1 \, \, \forall \, s, d, b \in B\backslash \{b_{|B_J|}\}, HS=1, V_{b,d}=V_{b_{+1},d}=1 \nonumber \\ \end{aligned}$$
(19)

The teachers seldom change location during the consultation period, so travel time is not needed. However, as mentioned it is undesirable for the teachers to have a long sequence of meetings, as they need a break now and then. The maximum size of a sequence for a teacher is denoted \(Q \in \mathbb N \). Let the variable \(y_{t,b,d} \in \{0,1\}\) take value 1 if time slot \(b\) is the start of a sequence of length greater than \(Q\) on day \(d\) for teacher \(t\). The following equation constraints this variable,

$$\begin{aligned}&\sum _{s} \sum _{\begin{array}{c} b^{\prime } = b\\ V_{b^{\prime },d}=1 \end{array}}^{b + Q} p_{t,b^{\prime },d}- y_{t,b,d} \le Q \qquad \forall \, t,d,b\in B \backslash \{b_j | j > |B| - Q\}, V_{b,d} = 1 \nonumber \\ \end{aligned}$$
(20)

The contribution to the objective is given by

$$\begin{aligned}&-\sum _{t,b,d} \omega \cdot y_{t,b,d} \end{aligned}$$
(21)

3.4 Day distribution

In case the consultation has more than one day it is preferred that each student and each teacher only has meetings on a single day. \(u_{t,d}\in \{0,1\}\) and \(u_{s,d}\in \{0,1\}\) denotes if teacher \(t\) or student \(s\) has a meeting on day \(d\), respectively. \(v_{t}\in \mathbb N \) and \(v_{s}\in \mathbb N \) denotes the number of days where teacher \(t\) and student \(s\) have meetings, respectively. The number of days with meetings is punished in the objective by

$$\begin{aligned}&- \sum _s \zeta _s \cdot v_{s}- \sum _t \zeta _t \cdot v_{t}\end{aligned}$$
(22)

and is constrained by the following

$$\begin{aligned}&\sum _{g} V_{b,d} \cdot x_{s,g,b} \le u_{s,d}\qquad \forall \, s,b,d \end{aligned}$$
(23)
$$\begin{aligned}&\sum _{s,g} V_{b,d} \cdot P_{t,g} \cdot x_{s,g,b} \le u_{t,d}\qquad \forall \, t,b,d \end{aligned}$$
(24)
$$\begin{aligned}&\sum _{d} u_{s,d}- 1 \le v_{s}\qquad \forall \, s \end{aligned}$$
(25)
$$\begin{aligned}&\sum _{d} u_{t,d}- 1 \le v_{t}\qquad \forall \, t \end{aligned}$$
(26)

The entire model for CTP is given in (27).

3.5 IP model for CTP

$$\begin{aligned}Consultation \quad Timetabling \quad Problem \quad IP\qquad \qquad \qquad (\text{27 }) \end{aligned}$$
$$\begin{aligned}&\max \, \sum _{s,g,b} \left( \alpha _{s,g,b} - \sum _t \delta _t \cdot D_{t,b} \cdot P_{g,t} + \delta _s \cdot E_{s,b}\right) \cdot x_{s,g,b} -\sum _{s,d,j} \gamma _{s,j} \cdot vs_{s,d,j}-\sum _{t,d,j} \beta _{t,j} \cdot vt_{t,d,j}\nonumber \\&\qquad -\sum _{t,b,d} \omega \cdot y_{t,b,d} - \sum _s \zeta _s \cdot v_{s}- \sum _t \zeta _t \cdot v_{t}\end{aligned}$$
(27a)
$$\begin{aligned}&\text{ s.t. }\,\sum _b x_{s,g,b} \le R_{s,g} \qquad \forall \, s,g \end{aligned}$$
(27b)
$$\begin{aligned}&\qquad \sum _g x_{s,g,b} \le C_{s,b} \qquad \forall \, s,b \end{aligned}$$
(27c)
$$\begin{aligned}&\qquad \sum _{s,g} P_{g,t}\cdot x_{s,g,b} \le 1 \qquad \forall \, t,b \end{aligned}$$
(27d)
$$\begin{aligned}&\qquad \sum _g \left( x_{s,g,b} + x_{g,b_{+1},s}\right) \le 1 \qquad \forall \, s, d, b \in B\backslash \{b_{|B_J|}\}, HS=1, V_{b,d}=V_{b_{+1},d}=1 \end{aligned}$$
(27e)
$$\begin{aligned}&\qquad \sum _{s} \sum _{\begin{array}{c} b^{\prime } = b\\ V_{b^{\prime },d}=1 \end{array}}^{b + Q} p_{t,b^{\prime },d}- y_{t,b,d} \le Q \qquad \forall \, t,d,b\in B \backslash \{b_j | j > |B| - Q\}, V_{b,d} = 1 \end{aligned}$$
(27f)
$$\begin{aligned}&\qquad \sum _{g} V_{b,d}\cdot x_{s,g,b} = n_{s,b,d}\qquad \forall \, s,b, d, V_{b,d} = 1 \end{aligned}$$
(27g)
$$\begin{aligned}&\qquad |B|_d - (|B|_d - \text{ ord }(b))\cdot n_{s,b,d}\ge f^{\text{ first }}_{s,d}\qquad \forall \, s,b,d, V_{b,d} = 1 \end{aligned}$$
(27h)
$$\begin{aligned}&\qquad \text{ ord }(b)\cdot n_{s,b,d}\le f^{\text{ last }}_{s,d}\qquad \forall \, s,b,d, V_{b,d} = 1 \end{aligned}$$
(27i)
$$\begin{aligned}&\qquad f^{\text{ last }}_{s,d}- f^{\text{ first }}_{s,d}+ 1 - \sum _bn_{s,b,d}\cdot (1 + HS) + HS = z_{s,d}\qquad \forall \, s,d \end{aligned}$$
(27j)
$$\begin{aligned}&\qquad \sum _j vs_{s,d,j}\cdot \text{ ord }(j) = z_{s,d}\qquad \forall \, s,d \end{aligned}$$
(27k)
$$\begin{aligned}&\qquad \sum _j vs_{s,d,j}= 1 \qquad \forall \, s,d \end{aligned}$$
(27l)
$$\begin{aligned}&\qquad \sum _{g,s} V_{b,d}\cdot P_{g,t}\cdot x_{s,g,b} = p_{t,b,d}\qquad \forall \, t,b,d,V_{b,d} = 1 \end{aligned}$$
(27m)
$$\begin{aligned}&\qquad |B|_d - (|B|_d - \text{ ord }(b))\cdot p_{t,b,d}\ge h^{\text{ first }}_{t,d}\qquad \forall \, t,b,d, V_{b,d} = 1 \end{aligned}$$
(27n)
$$\begin{aligned}&\qquad \text{ ord }(b)\cdot p_{t,b,d}\le h^{\text{ last }}_{t,d}\qquad \forall \, t,b,d, V_{b,d} = 1 \end{aligned}$$
(27o)
$$\begin{aligned}&\qquad h^{\text{ last }}_{t,d}- h^{\text{ first }}_{t,d}+ 1 - \sum _bp_{t,b,d}= w_{t,d}\qquad \forall \, t,d \end{aligned}$$
(27p)
$$\begin{aligned}&\qquad \sum _j vt_{t,d,j}\cdot \text{ ord }(j) = w_{t,d}\qquad \forall \, t,d \end{aligned}$$
(27q)
$$\begin{aligned}&\qquad \sum _j vt_{t,d,j}= 1 \qquad \forall \, t,d \end{aligned}$$
(27r)
$$\begin{aligned}&\qquad \sum _{g} V_{b,d} \cdot x_{s,g,b} \le u_{s,d}\qquad \forall \, s,b,d \end{aligned}$$
(27s)
$$\begin{aligned}&\qquad \sum _{s,g} V_{b,d} \cdot P_{t,g} \cdot x_{s,g,b} \le u_{t,d}\qquad \forall \, t,b,d \end{aligned}$$
(27t)
$$\begin{aligned}&\qquad \sum _{d} u_{s,d}- 1 \le v_{s}\qquad \forall \, s \end{aligned}$$
(27u)
$$\begin{aligned}&\qquad \sum _{d} u_{t,d}- 1 \le v_{t}\qquad \forall \, t \end{aligned}$$
(27v)
$$\begin{aligned}&\qquad x_{s,g,b} \in \{0,1\}, \, y_{t,b,i} \in \{0,1\}\end{aligned}$$
(27w)
$$\begin{aligned}&\qquad w_{t,d}\in \mathbb N , \, z_{s,d}\in \mathbb N \end{aligned}$$
(27x)
$$\begin{aligned}&\qquad vt_{t,d,j}\in \{0,1\}, \, vs_{s,d,j}\in \{0,1\}\end{aligned}$$
(27y)
$$\begin{aligned}&\qquad f^{\text{ first }}_{s,d}\in \mathbb N , \, f^{\text{ last }}_{s,d}\in \mathbb N , \, h^{\text{ first }}_{t,d}\in \mathbb N , \, h^{\text{ last }}_{t,d}\in \mathbb N \end{aligned}$$
(27z)
$$\begin{aligned}&\qquad p_{t,b,d}\in \{0,1\}, \, n_{s,b,d}\in \{0,1\} \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad (\text{27aa }) \\&\qquad u_{s,d}\in \{0,1\}, \, u_{t,d}\in \{0,1\} \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \,\,\, (\text{27ab }) \\&\qquad v_{s}\in \mathbb N , \, v_{t}\in \mathbb N \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \qquad \qquad \qquad \qquad \quad \qquad \quad \quad \quad \,\, (\text{27ac }) \end{aligned}$$

3.6 Complexity

In the following a proof of \(\mathcal NP \)-hardness is given by showing that a well known NP-hard problem, the Graph Coloring Problem (GCP), is polynomially reducible to CTP.

An arbitrary instance of GCP consists of a graph \(G=(V,E)\) and a number of colors \(k\). The decision-version of the GCP asks the following: Does graph \(G\) admit a proper vertex coloring with \(k\) colors, such that no adjacent vertices take the same color?

To answer this question we solve a CTP with parameters \(\delta _t = \delta _s = \gamma _{s,j} = \beta _{t,j} = \omega = \zeta _s = \zeta _t = HS = 0, \alpha _{s,g,b} = 1, C_{s,b} = 1\). This makes all constraints redundant, except for (27b), (27c) and (27d). Further assuming every student has exactly one request, \(\sum _g R_{s,g} = 1\, \forall s\), makes constraint (27c) redundant.

For each vertex \(v \in V\) in graph \(G\), create a student \(s\) and a teachergroup \(g\), and let the meeting request \((s,g)\) represent vertex \(v\). The set of vertices is hence represented by a setting of parameter \(R_{s,g}\). If vertex \(v_1 = (s_1,g_1)\) and vertex \(v_2 = (s_2,g_2)\) are adjacent in graph \(G\), create a teacher \(t\) and assign it to both \(g_1\) and \(g_2\), i.e. \(P_{g_1,t}=P_{g_2,t}=1\). I.e. every teacher will have exactly two meeting requests. Let the set of time slots \(B\) represent the set of colors (such that \(\left| B\right| = k\)).

Hence the GCP-instance is represented by the following CTP instance:

$$\begin{aligned} \max&\quad \sum _{s,g,b} x_{s,g,b}\end{aligned}$$
(28a)
$$\begin{aligned} \text{ s.t. }&\quad \sum _b x_{s,g,b} \le R_{s,g} \qquad \forall \, s,g\end{aligned}$$
(28b)
$$\begin{aligned}&\quad \sum _{s,g} P_{g,t}\cdot x_{s,g,b} \le 1 \qquad \forall \, t,b \end{aligned}$$
(28c)
$$\begin{aligned}&\quad x_{s,g,b} \in \{0,1\} \end{aligned}$$
(28d)

Constraint (28b) specifies that each vertex (meeting request) can at most be assigned one color (time slot). Constraint (28c) specifies that no teacher can be assigned more than one meeting in each time slot, which specifies that no adjacent vertices can take the same color.

To answer the question whether \(G\) is \(k\)-colorable, solve the CTP instance (28) and check if all meeting requests are assigned a time slot, i.e. \(\sum _b x_{s,g,b} = 1 \quad \forall s,g, R_{s,g} = 1\). If so the answer is yes, otherwise the answer is no. Hence the Graph Coloring Problem is polynomially reducible to CTP, and CTP is therefore \(\mathcal NP \)-hard.

3.7 Defining weights

In the following the weights of the model are selected due to the preferences of the Danish high schools. MaCom A/S has greatly assisted this process. Table 1 lists all the weights in the model and their priority.

Table 1 Weight prioritizing

From analysis of previous consultations in the Danish high schools, it is noticed that the students rarely request more than five teacher groups for consultations. And even though the students have the opportunity to request more than five, they seldom use this option. From this analysis it is chosen to stick the request with priority  higher than five to the same weight. This gives the following function for the request weights \(\alpha _{s,g,b}\):

$$\begin{aligned}&\alpha _{s,g,b} = {\left\{ \begin{array}{ll} \kappa _b + 12 - 2\cdot (i - 1) &{} i \le 5 \\ \kappa _b + 2 &{} i \ge 6 \end{array}\right. } \end{aligned}$$
(29)

where \(i \in \mathbb Z ^+\) is the priority of request \((g,s)\). Furthermore there is a set-point for each day given by \(b^*_d\) for which it is desired that the schedule plan for the given day is centered around. Let \(\kappa _b\) denote the penalty for assigning a request to time slot \(b\), defined by

$$\begin{aligned} \kappa _b = - \sum \limits _d\dfrac{V_{b,d} \cdot |b_d^* - b|}{\left| B_d\right| } \end{aligned}$$
(30)

The cost of an undesirable break for a teacher, \(\beta _{t,j}\), is defined as follows,

$$\begin{aligned} \beta _{t,j} =\left\{ \begin{array}{ll} 0 &{} j = 0\\ j &{} j \ge 1 \wedge SCTP \\ j^{1 + \frac{1.5}{\eta _t}} &{} j \ge 1 \wedge PCTP \end{array}\right. \end{aligned}$$
(31)

where \(\eta _t\) is the number of requests for teacher groups where teacher \(t\) is a group member, \(\eta _t = \sum _{s,g} P_{g,t} \cdot R_{s,g}\). I.e. \(\beta _{t,j}\) depends on the number of requests for the given teacher \(t\). The distribution of \(\beta _{t,j}\) is chosen such that a teacher which few students have requested is given a high penalty for undesirable breaks. Likewise, a teacher with many requests has a low penalty for undesirable breaks. This is due to the fact that teachers with many requests will most likely have a more dense schedule, and are therefore not too picky about additional breaks. The reason why there is a difference between the weights for the different consultations types is due to the consultation interval. For the PCTP the consultation meetings are normally located in the evening, and hence we want to penalize the undesirable breaks. The SCTP is typically taken place in the daytime, i.e. the teachers are already at the high school, hence undesirable breaks are not that significant. The weight of an undesirable break for a student \(\gamma _{s,j}\) is analogues,

$$\begin{aligned} \gamma _{s,j} =\left\{ \begin{array}{ll} 0 &{} j = 0\\ j ^{1 + \frac{2}{\eta _s}} &{} j \ge 1 \end{array}\right. \end{aligned}$$
(32)

where \(\eta _s\) is the number of requests of student \(s\).

The cost for violating the length of a sequence for a teacher is given by \(\omega \).

$$\begin{aligned} \omega = 2 \end{aligned}$$
(33)

In our model of the CTP the high school administration selects if interrupting other activities of the students or teachers is allowed. If this is not the case, the costs \(\delta _t\) and \(\delta _s\) are selected as infinitely high (implementation-wise the corresponding constraints are treated as hard-constraints). If interrupting other activities are not allowed, these costs are selected as a constant value,

$$\begin{aligned} \delta _t = \delta _s = \left\{ \begin{array}{ll} \infty &{} \text{ Interrupting } \text{ activities } \text{ not } \text{ allowed } \\ 4 &{} \text{ Interrupting } \text{ activities } \text{ allowed } \wedge PCTP \\ 1 &{} \text{ Interrupting } \text{ activities } \text{ allowed } \wedge SCTP \end{array}\right. \end{aligned}$$
(34)

Like for the undesirable break cost \(\beta _{t,j}\), we distinguish between the two consultations types. For the PCTP it is expensive to interrupt other activities since it is held in the evening and hence other activities are typically other types of meetings. The SCTP is held in the daytime, and it is allowed to ‘lent’ a student from a lecture for a small cost. The penalties for assigning students and teachers to multiple days are given by

$$\begin{aligned}&\zeta _s = 8\\&\zeta _t = 1 \end{aligned}$$

4 Adaptive Large Neighborhood Search (ALNS)

In this section a heuristic alternative to solving the IP-model (27) is described. The performance of these two methods are compared in Sect. 6.

As the local search algorithm we have chosen to use the Large Neighborhood Search (LNS) proposed by Shaw (1998). Most local search algorithms explicitly defines the neighborhood, but the neighborhood in LNS is defined implicitly by a destroy and a repair method. The neighborhood of a solution is then defined as the set of solutions that can be reached by first applying a destroy method and then a repair method. In this article we will use ALNS, in which the LNS is extended by multiple destroy and repair methods. ALNS was first described in Pisinger and Ropke (2005), and has since been used with success on various problems, especially variants of Vehicle Routing Problems (VRP), see e.g. Ropke and Pisinger (2006), Laporte et al. (2010), Azi et al. (2010), Ribeiro and Laporte (2012), and Lei et al. (2011). A pseudo-code for the ALNS heuristic is shown in Algorithm 1.

figure a

The sets of destroy and repair methods are denoted \(\Omega ^-\) and \(\Omega ^+\), respectively. The variable \(\pi \), which stores the weight of all destroy and repair methods, is introduced in line 1. Initially all methods have the same weight. In line 4 the weight parameter \(\pi \) is used to select the destroy and repair methods. In line 10 an accept function evaluates if the new solution should become the new current solution. The accept function can be implemented in different ways. We have chosen to implement a Simulated Annealing-like acceptance criterion, which will be described later.

An ALNS framework has the advantage of using different neighborhoods, such that the algorithm hopefully explores a large part of the solution space. For more information regarding ALNS we recommend Ropke and Pisinger (2006) and Pisinger and Ropke (2010).

4.1 ALNS scoring scheme

A central part of the ALNS algorithm is the scoring scheme of destroy and repair methods. A scoring scheme can essentially be characterized by two central topics; (1) How to quantify the performance of each heuristic. (2) The reaction factor, i.e. how sensitive is the selection process to recent records of performance.

We adapt a scoring scheme based on the technical report of Muller and Spoorendonk (2010), where performance is tracked by the percentage-wise gap between the new found solution and the current solution. This scoring scheme has the advantage of having few parameters to tune, and using the gap between solutions seems as a intuitively good way of measuring heuristic performance. Below the scoring scheme is explained in details.

Runs of the algorithm is divided into segments \(\{t_0, t_1, \ldots , t_n\}\) each consisting of \(N_{\text{ it }}\) iterations. Let \(\pi _i^t\) be the weight of heuristic \(i\) in segment \(t\). The probability of choosing heuristic \(i\) in segment \(t\) is \(\frac{\pi _i^t}{\sum _j \pi _j^t}\). At the end of each segment \(t\), the following update is performed for all heuristics,

$$\begin{aligned} \pi _i^{t+1} = \rho \frac{\bar{\pi }_i^t}{a_i^t} + (1 - \rho ) \pi _i^t \end{aligned}$$
(35)

where \(a_i^t\) is the number of times heuristic \(i\) has been selected in segment \(t\). \(\bar{\pi }_i^t\) is the observed weight of heuristic \(i\) in segment \(t\), which in each iteration is incremented depending on the quality of the new found solution. \(\rho \in [0,1]\) is the reaction factor. A high reaction factor means that the weights of a segment will be very dependent upon the observed weights of the previous segment.

The observed weight \(\bar{\pi }_i^t\) is updated in each iteration. Let \(x\) be the current solution, and \(x^{\prime }\) the new found solution by applying neighborhood \(i\). In the technical report Muller and Spoorendonk (2010) the following formula is used,

$$\begin{aligned}&\text{ gap } = \frac{c(x^{\prime }) - c(x)}{c(x)} \end{aligned}$$
(36)
$$\begin{aligned}&\bar{\pi }_i^t = \bar{\pi }_i^t + m^{\text{ gap }} \end{aligned}$$
(37)

where \(m\) is a constant. We will use a slightly changed version of this formula, since we have observed that the gap formulated by (36) most often yields values of magnitude \(\pm 10^{-4}\), meaning that the observed weight \(\bar{\pi }_i^t\) will rarely change value of significant magnitude. Therefore we introduce a scale parameter in the formula,

$$\begin{aligned} \bar{\pi }_i^t = \bar{\pi }_i^t + m^{\min \left( \sigma \cdot \text{ gap } , 1\right) } \end{aligned}$$
(38)

where \(\sigma \in \mathbb R ^+\) is a parameter that needs tuning. We fix \(m = 5\) and rely on the parameter tuning to set a suitable value for \(\sigma \). The \(\min \)-operator in the exponent of \(m\) is necessary to ensure the weight stay within a reasonable interval, in case we hit an iteration where the scaled gap is big and positive.

4.2 Request removal

The ALNS heuristic for the CTP makes use of two different removal heuristics, each searching a given removal neighborhood. The heuristics takes as input a given solution \(x_{s,g,b}\) and an integer \(q \in \mathbb N \). The output of the heuristics is the solution where \(q\) meetings have been removed. The value of \(q\) is selected as a random number which satisfies,

$$\begin{aligned} 3 \le q \le \max \left( \xi \cdot \sum _{g,s} R_{s,g}, 5 \right) \end{aligned}$$
(39)

where \(\xi \in ]0, 1]\) is the maximum percentage of requests to remove. In accordance with Muller (2009) we decay \(\xi \) over time, starting with a high value \(\xi _{\text{ start }}\) and ending with a smaller \(\xi _{\text{ end }}\). Given the runtime of the algorithm, we divide it into 100 segments such that \(\xi \) is decreased by \(\frac{\xi _{\text{ start }}- \xi _{\text{ end }}}{100}\) in each segment. This decay of \(\xi \) means that the size of the searched neighborhood is progressively reduced. This has the advantage of only performing small changes towards the end of the solution process, where we expect a good solution has been found.

4.2.1 Random removal

The simplest removal heuristic, which randomly removes \(q\) meetings from the solution. This simple heuristic obviously has the effect of diversifying the search.

4.2.2 Shaw removal

This removal heuristic was introduced in Shaw (1997, 1998) where it is used on the VRP. In this section the heuristic is modified to suit the CTP. The general idea of the heuristic is to remove meetings which are somehow related, since there is a good chance that such requests can swap positions and possibly improve the solution. In this paper two factors determine if meeting \(i\) is related to meeting \(j\): Similarity between students, and similarity between teachers. \(S_i\) and \(T_i\) indicates the set of students and teachers of meeting \(i\), respectively. Notice that a meeting always contains exactly one student, i.e. \(|S_i| = 1\). Let the measure of relatedness between meeting \(i\) and \(j\) be defined by \(\mathcal M (i,j) \in [0,1]\),

$$\begin{aligned} \mathcal M (i,j) = \frac{\left| T_{i} \cap T_{j} \right| + \left| S_{i} \cap S_{j} \right| }{\min (|T_i|, |T_j|) + 1} \end{aligned}$$
(40)

I.e. relatedness is the percentage of individuals which is shared between the meetings, such that a high value of \(\mathcal M \) means that the meetings are very related. This simple formulation of relatedness is done without any additional parameters. An alternative natural formulation would be to scale the student-relatedness and teacher-relatedness by two independent parameters. However we have chosen the shown formula due to its simplicity. An addition to the formula could be to introduce a term which determines time slot relatedness, although it should be noted that relatedness between slots is only directly relevant if there also exists some relatedness between students or teachers.

A pseudo code for Shaw removal is shown in Algorithm 2.

figure b

To avoid the situation where the same meetings are removed over and over, the algorithm is randomized. The level of randomness is controlled by the parameter \(p_{\text{ shaw }}\in \mathbb R ^+, p_{\text{ shaw }}\ge 1\). This means that \(p_{\text{ shaw }}\) somehow defines how random the element is chosen, where \(p_{\text{ shaw }}=1\) corresponds to completely random.

4.3 Repair heuristic

The repair heuristics are given a set of consultation meetings and a set of not granted meeting-requests.

4.3.1 Basic greedy heuristic

A trivial algorithm for the CTP is a simple greedy algorithm which places one request at a time in order of contribution to the objective. In each iteration of the algorithm this process is repeated until no more requests which improves the solution can be inserted. Implementation wise the algorithm suffers from cost-dependencies, since the contribution of inserting each request possibly changes after each insertion. This is slightly optimized by only recalculating the cost of those requests which the last insertion can possibly effect. I.e. recalculate the cost of those requests which has the same student as the last insertion, or if the teacher group overlaps with the one of the last insertion. This repair heuristic is used to create an initial feasible solution for the CTP.

4.3.2 Regret heuristics

The regret heuristic improves the basic greedy by incorporating a kind of look-ahead information when selecting a request to insert. Informally speaking, the heuristic aims at inserting the request which we will regret most if not inserted immediately. The regret heuristic has been used by Potvin and Rousseau (1993) and Pisinger and Ropke (2005) for the VRP with Time Windows. Let \(c_{r}^k\) denote the change in the objective value by inserting request \(r\) into the \(k^{\text{ th }}\) best position. E.g. \(c_r^2\) denotes the change in the objective value by inserting request \(r\) in the second best position. A Regret-2 heuristic will in each iteration choose to insert the request \(r\) where the difference between best and second best position is largest, i.e.

$$\begin{aligned} r := \mathop {\text{ arg } \text{ max }}\limits _{r \in R_g^s, c_r^1 > 0}\left( c_r^1 - c_r^2\right) \end{aligned}$$
(41)

The request \(r\) is inserted at its best position, so we restrict the heuristic to only look at requests where the best position is actually feasible and yields a positive change in objective. This restriction is necessary since the objective of the CTP contains both a minimization and a maximization part, and we are not interested in inserting requests which have negative impact on the objective. The heuristic can be extended by looking at \(k\) positions for each request. The request to insert is then chosen according to

$$\begin{aligned} r := \mathop {\text{ arg } \text{ max }}\limits _{r \in R_g^s, c_r^1 > 0}\sum _{h=2}^k\left( c_r^1 - c_r^h\right) \end{aligned}$$
(42)

We will in this paper incorporate the regret heuristic for several choices of \(k\). The basic greedy algorithm from the previous section is a Regret-1 heuristic due to the tie-breaking rules. For a Regret-1 heuristic the most profitable request is inserted in each iteration. Most papers distinguish between Regret-1 and other regret heuristic, however implementation wise they are not very different. Setting \(k=|B|\) corresponds to the full Regret-\(k\) heuristic.

Even though the regret heuristic is designed for VRP, it seems well suited for the CTP due to its assignment character. It seems valuable to attempt to predict which request is the most critical to insert. By some basic tests, we have chosen to use Regret-2, Regret-3, and Regret-\(|B|\) as insertion heuristics.

4.4 Algorithm setup

According to Ropke and Pisinger (2006), using myopic repair heuristics, like those of this paper, one may apply noise to the objective function to obtain a more efficient algorithm. By applying noise, the repair heuristic will not always make the move that seems best locally. Ropke and Pisinger (2006) support this by strong computational results. However, preliminary tests show that, in our case, adding noise does not yield a more efficient algorithm. More precisely, noise was added such that it was controlled by a linear-scale parameter, and excessive tuning on this parameter yielded no convergence at all. I.e. this parameter had (close to) no impact on the algorithm efficiency. A similar result for the Cumulative Capacitated VRP is reported in Ribeiro and Laporte (2012).

In occurrence with Ropke and Pisinger (2006) we borrow an acceptance criteria from Simulated Annealing (SA). A solution \(x\) is always accepted if \(c(x) > c(x_{\text{ best }})\). If \(c(x) < c(x_{\text{ best }})\) then \(x\) is accepted with probability \(\exp \left( -\frac{c(x_{\text{ best }}) - c(x)}{T}\right) \). In each iteration the temperature \(T\) is updated by \(T= d_{\text{ SA }}T\), where \(0 <d_{\text{ SA }}< 1\). Giving the temperature control parameter \(w_{\text{ SA }}\), \(0 < w_{\text{ SA }}< 1\), \(T\) is initially selected such that a solution is accepted with probability \(\tfrac{1}{2}\) if its change in objective is \(w_{\text{ SA }}\) percent worse than the initial solution \(x_{0}\), i.e.

$$\begin{aligned}&\exp \left( -\frac{(c(x_{0}) - (1-w_{\text{ SA }})\cdot c(x_{0}))}{T_0}\right) = \tfrac{1}{2}\quad \Rightarrow \quad T_0 = \frac{w_{\text{ SA }}\cdot c(x_{0})}{\ln (2)} \end{aligned}$$
(43)

This has the advantage of better adapting the temperature to each dataset.

Furthermore at the start of each segment (those of the ALNS scoring scheme), the current solution is set to the current best.

5 Parameter tuning

The proposed heuristic contains many free parameters. It is essential that these parameters are tuned to achieve good performance, see e.g. Adenso-Diaz and Laguna (2006) and Diao et al. (2003). Tuning of metaheuristics is usually done by rules-of-thumb and the researchers personal experience. However some well performing automated algorithms have lately been introduced, mainly ParamILS (Hutter et al. 2009) and Race-algorithms (Birattari 2005). In this paper we will use the F-Race algorithm for tuning, as its implementation burden seems light, and it has proven competitive for some heuristic methods, see Montero et al. (2010) and Pellegrini et al. (2010).

The main idea of a race algorithm is to sequentially process a set of data instances using all possible parameter configurations. In each iteration, the parameter configurations which are statistically inferior are eliminated. The algorithm is ran until one parameter configuration remains or the specified time limit is exceeded, see Algorithm 3. If more than one parameter configuration remains once the algorithm terminates, the one which in average has performed best is selected. The advantage of a race algorithm is that bad parameter configurations are eliminated early, such that no more valuable computation time is spend on evaluating these. The racing algorithm differs from most other tuning approaches in the sense that it only performs one algorithm run per parameter configuration per data instances. This relies on the proof in Birattari (2005) where it is shown that this is the optimal experimental setting in terms of variance of the estimated performance.

figure c

In a F-Race algorithm, the Friedman Two-way Analysis of Variance by Ranks test is used to determine whether there is sufficient statistically evidence to eliminate parameter configurations from future iterations. If this is the case, then post-tests are performed where pairwise comparison between the best candidate and the remaining determines which configurations should be eliminated, if any. The F-Race algorithm has been successfully used for tuning in a number of cases, see e.g. Chiarandini et al. (2006), Becker et al. (2006) and Pellegrini and Birattari (2007).

A problem of the described Racing algorithm, which applies to most tuning frameworks, is the so-called full-factorial design, meaning that the full set of parameter configuration is initially considered. This results in the F-Race becoming impractical and computational prohibitive, if there exists a large number of parameters and each parameters can take a modest number of values. This has been addressed in Balaprakash et al. (2007) by defining a probabilistic model on the set of all possible parameter configurations, such that a small set of parameter configurations is generated in each iteration of the tuning process. Elite configurations are used to update the model to bias the search around high quality parameter configurations. This version of F-Race is denoted Iterative F-Race (I/F-Race).

In this paper we use a simplified I/F-Race algorithm, where we start out with a small subset of parameter configurations, and based on the Race-results of these we manually construct new configurations, which are believed to be superior. One could think of this approach as a sort of manual iterative F-Race. Table 2 shows the best found parameter configuration. It should be mentioned that we set \(b_{\text{ SP }} = b_0\) and \(\delta ^s = \delta ^t = \infty \), since these are the most common values chosen by the users. The datasets used are of the school year 2011/2012. \(w_{\text{ SA }}\) is the temperature control parameter and \(d_{\text{ SA }}\) is the decay parameter for the SA based acceptance criteria. \(N_{\text{ it }}\) defines the number of iterations between resets. \(\rho \) and \(\sigma \) are reaction factor and the scale parameter for the ALNS scoring scheme, respectively. \(\xi _{\text{ start }}\) and \(\xi _{\text{ end }}\) are the destroy percentage in the beginning and in the end of the running time. Finally, \(p_{\text{ shaw }}\) indicates how random the element is chosen in the Shaw removal.

Table 2 Final values of tuned parameters, found by the F-Race algorithm with confidence level \(\alpha = 0.05\)

6 Performance

The goal of this section is to evaluate the performance of the developed solution methods, the ALNS algorithm and solving the IP model. Also a comparison with the existing heuristic of Lectio is made. All tests are performed using nUnit 2.6 in C# 4.0 on a machine with an Intel i7-930@2.8 GHz CPU and 12 GB of RAM. No parallelization has been implemented.

6.1 Performance comparison between ALNS and Gurobi

In the following, the performance of the state-of-the-art MIP solver Gurobi 5.01 (currently top-ranked in the MIP benchmark of Mittelman 2012) and the implemented ALNS algorithm are compared. For both the PCTP and the SCTP, 100 datasets from the school-year 2011/2012 are selected from the database of Lectio.

In this experimental setup, the ALNS algorithm is run for 2 min. This low running time is due to the following: (1) The schools does generally not expect an algorithm to run longer, as they are usually not aware that it is a hard problem to solve. Some even believe the problem is trivial. (2) The ALNS “tailors-off” after a while, i.e only minor improvements are seen on the best found solution after the 2-min mark.

The Gurobi solver is run for 1 h, because we do not only want to evaluate the performance in terms of best found IP solution, we also want a good upper bound for the instances.

In the performance tests it is not allowed to interrupt other activities for the PCTP, i.e. \(\delta _t = \delta _s = \infty .\) For the SCTP interrupting activities is allowed. This is due to the fact that PCTP is normally arranged in the evening while SCTP is during the normal work-hours.

From Table 3 it is seen that ALNS in average finds solutions 4 % from optimum. Even though ALNS has far lower running time than Gurobi, it finds better solutions in almost all cases.

Table 3 Gurobi 5.01 and ALNS for the PCTP on 100 datasets

Table 4 shows the performance for the SCTP.

Table 4 Gurobi and ALNS for the SCTP on 100 datasets

From Table 4 it is seen that ALNS in average finds solutions 4.1 % from optimum for the SCTP. This is lower than the average gap for Gurobi, which is 7.7 %.

From Table 3 it is seen that ALNS outperforms Gurobi for the PCTP. For the SCTP, the results are more blurred, but the ALNS still performs best in 70 out of 100 cases. What can also be seen from Tables 3 and 4 is that the standard deviation for the ALNS is low in all cases, and the maximum gap obtained across all datasets is considerably lower than the maximum gap which Gurobi obtains (even given the higher running time of Gurobi). This is important as the customers of Lectio expects a consistent and stable solution procedure.

6.2 Performance comparison of ALNS and current heuristic of Lectio

The current algorithm in Lectio is an undocumented heuristic, which initially attempts to fulfill every meeting request by assigning them to random time slots, and then attempts to find improving solutions with a hill-climber embedded in a SA framework. This heuristic does not support the SCTP. In this section, we compare the existing heuristic of Lectio with the implemented ALNS algorithm.

The comparison of algorithms for the PCTP is done by adapting the objective of the ALNS so it matches the one of the implemented SA algorithm, which yields the following changes:

  • Since the SA algorithm attempts to fulfill all meeting requests, we set \(\alpha _g^s\) to a huge value for all meeting requests, effectively making the ALNS behave the same way.

  • We set \(\beta ^t = \gamma ^s = 2\).

  • The SA algorithm does not allow interrupting of activities, i.e. \(\delta ^t = \delta ^s = \infty \).

  • The time slot set-point setting of the SA solver is broken, so we set \(\kappa = 0\) and likewise for the SA solver.

  • The violation of sequence length for teachers is penalized in quadratic way. This means that term (21) is now written as \(-\sum _{t,b,d} \left( y_{t,b,d}\right) ^{\omega }\), and \(\omega = 2\).

We evaluate the algorithms on 100 randomly selected datasets for the school-year 2009/2010. The reason a new batch of datasets is selected for this test is that the existing heuristic of Lectio does not support all features mentioned in this paper. Due to customer requests, Lectio is continuously developed, and this also effects the CTP. E.g. datasets from the school-year 2009/2010 does not support features such as multiple days for a consultation.

Experience has shown that the SA algorithm needs long runtime to provide meaningful solutions. We set runtime equal to 10 min, which is significantly higher than the preferable runtime of the high schools, as described in Sect. 6.1. Furthermore, to reduce the influence of stochastic behavior, we perform 10 runs on each dataset with each solver.

Table 5 shows the average performance, the standard deviation and the number of unassigned requests of both algorithms. Recall that in this test both algorithms attempt to fulfill every meeting request. However it cannot be guaranteed that the algorithms can find a solution which satisfies this, nor that it even exists. Furthermore we compare the algorithms in the domain of the SA algorithm, and in this domain it is only attempted to minimize the different costs, i.e. no benefit is made from fulfilling meeting requests. This means it would not be fair to compare solutions which does not have the same amount of unassigned requests, since additional fulfilled requests will potentially yield additional cost of e.g. number of breaks, interrupted activities, etc. Therefore we enforce the following criterion to determine whether the comparison of solutions for a dataset is valid: The difference in the number of unassigned requests must lie in the interval \(\pm 1\). This means the comparison of solvers in Table 5 is only approximate, however it can be considered as a very good approximation, since a difference of one fulfilled meeting request will have minor influence of the objective. Notice that the objectives are given in the domain of the SA algorithm, which is of different magnitude than the objective of this article. This is due to the fact that the undocumented heuristic aims at minimizing whereas the approach of this paper is to maximize.

Table 5 Comparison of performance of the SA algorithm and the ALNS algorithm

Given the average objective of the SA algorithm \(\bar{x}_{\text{ SA }}\) and the average objective of the ALNS algorithm \(\bar{x}_{\text{ ALNS }}\), and that the comparison of these two is valid, we compute the difference \(\bar{x}_{\text{ SA }} - \bar{x}_{\text{ ALNS }}.\) For almost every instance where comparison is valid, the ALNS algorithm in average finds a better solution. Furthermore the solutions from the ALNS algorithm has far lower deviation than the SA algorithm, which are important, as the users of the algorithm will usually only run the algorithm once.

By the computational tests of this section, it has been shown that the ALNS algorithm is the best solution procedure, of those considered in this paper, for the CTP. It outperforms both Gurobi and the existing heuristic of Lectio in terms of both solution quality and reliability.

7 Final remarks and outlook

It has been shown how the CTP, an important real-life problem for the Danish high schools, can be modeled using linear IP. ALNS has proven successful in establishing solutions for two versions of the problem, the PCTP and the SCTP. Furthermore, F-Race has shown to be an efficient method for tuning of the free parameters. The developed ALNS algorithm has been implemented in Lectio and is hence available for 95 % of the Danish high schools.

In case of the PCTP, it has been shown that the ALNS algorithm in average finds solutions which are less than 4 % from optimum. This average is taken over 100 real-life dataset, and therefore we have high confidence in this result. Furthermore it has been shown that comparing with the existing algorithm in Lectio, which is the only other known heuristic algorithm for the problem, the ALNS algorithm is far superior. For 83 of the 86 datasets, ALNS finds better solutions, and in many cases the solution quality of the ALNS is considerably better. For the remaining 14 datasets a comparison was not considered fair.

The performance for the SCTP is also tested on 100 real-life dataset. For these datasets, it is shown that the ALNS algorithm in average finds solutions less than 5 % from optimum.

For both the PCTP and SCTP the average solution found by ALNS is better than the solutions found by the state-of-the-art MIP solver Gurobi 5.01.

The main subject for further research is considered to be the use of Dantzig-Wolfe decomposition and solution using Branch-and-Price. In this context, a column in the master problem could represent a meeting-plan for a student or a teacher. This would move many constraints to the subproblem, possibly giving a stronger IP formulation, which could lead to a more efficient IP-based solution approach.

Another possibility for future research is to combine the two solution methods described, i.e. using the MIP solver as a repair heuristic within the ALNS. Similar approaches are seen in Muller et al. (2011) and Prescott-Gagnon et al. (2009), with competitive results.