1 Introduction

Nowadays the pressure on hospitals to work in a cost efficient way is continuously increasing [22, 33]. As repeatedly reported, up to one third of all German hospitals expect a reduction in staff to reduce cost [8]. One major challenge in this context is the long-term personnel planning for different specialities in hospitals. Especially in anesthesia departments, the allocation of anesthetists with different experience levels to surgeries is a challenging task [9]. Additionally, providing a backup system for emergency cases makes the scheduling problem even more difficult. Nevertheless, most departments schedule its physicians by hand at great cost and time. Furthermore, normally the creation of working schedules is carried out by a physician with high experience level though he is needed in the operating room (OR). Accumulation of overtime up to 150 h per month and dissatisfied physicians are the consequences [9]. Since 2007 new labor regulations have made it even more difficult to find a feasible schedule without violating these restrictions [21].

So, there is a great need for the development of efficient scheduling procedures that better meet supply with demand. Most of the work done in this field is about nurse scheduling and rostering [11]. However, many models for nurse scheduling cannot be applied for physician scheduling in a straightforward way. Reasons are complex and include high specialization of physicians, level of experience as well as a great variety of labor contracts which can differ by a wide spectrum of clauses arranged. All these conditions make it difficult to generate a satisfying roster for physicians. Nevertheless, some work tackle the more complex task of physician scheduling [1, 12, 28, 29]. For instance, Brunner et al. [9] present a mixed-integer programming model (MIP) for short-term planning. The objective is to minimize the paid out time using flexible shifts. The MIP describes an optimal situation where each physician has the highest experience level and hence is able to handle each task in the OR. But, the reality in a university hospital does not reflect this situation completely. The history of patients is very complex and hence does not allow for generalization. Furthermore, university hospitals are instructed to train physicians with low experience in order to maintain quality of treatment for future times. Consequently, university hospitals have to deal with different experience levels of its staff. One major task of unit managers is to determine the number staff for each experience level to meet demand in the OR and hence provide sufficient training – which is the focus of our work. To avoid any ambiguity throughout the paper, we use the following definitions.

  • Flexible shifts: Shifts may have arbitrary shift lengths and different shift starting times in accordance with labor regulations.

  • Specialist: A certified physician with high experience level who has already finished his training. The physician is able to handle each surgery.

  • Resident: A physician with low experience who is still in training. The physician cannot handle a surgery that needs a higher experience level than his experience level.

In this paper we address the long-term scheduling problem of defining a capacity profile which determines an optimal number of physicians with different experience levels, i.e. in case of two experience levels we consider specialists and residents, in order to meet a given demand profile. Note that the model is also valid for any number of different experience levels. Our objective is to minimize staffing costs. Salaries are calculated as average for each experience level according to German labor contracts [21]. The main contribution is the development of the MIP formulation and the implementation of a column generation based heuristic to find high quality solutions. The output can be used by administrators to make long-term decisions regarding hiring or dismissing physicians associated with different experience levels.

The remainder of this paper is structured as follows. In Section 2 we review the relevant research. We give a detailed problem description and a general MIP formulation in Section 3. Section 4 shows the decomposition strategy in terms of a column generation based heuristic to solve the underlying problem. Computational results using real data provided by our cooperation university hospital – München Rechts der Isar (MRI) – are displayed in Section 5. Here we highlight the efficiency of the proposed heuristic as well as the usage of the model for hospital management. In Section 6 we summarize our main findings and we give a short outlook

2 Literature

In the last two decades personnel scheduling problems have been investigated in very detail. Consequently, there exists a vast amount of published work in this area. Our review will be limited to the most relevant articles for our work. A general review of personnel scheduling can be found in Ernst et al. [17]. This paper classifies each personnel scheduling problem into one or more of six modules (namely demand forecasting, days-off scheduling, shift scheduling, line-of-work construction, task assignment, and staff assignment). Normally, the six modules are considered sequentially by planners. A general tutorial on staff scheduling problems is given in [7]. Most research on personnel scheduling in health care concentrates on scheduling nurses [11, 14, 17, 18]. However, scheduling physicians has got more attention in the recent past [9].

A MIP formulation for scheduling residents in a large teaching hospital is introduced by Franz and Miller [19]. Their procedure gives a systematic way to solve system inherent infeasibilities. Furthermore, a rounding heuristic is used to find feasible schedules. In related work, Beaulieu et al. [1] introduce a shift based model for scheduling emergency room physicians. The 6 month planning horizon is decomposed in their solution procedure. Several kinds of requests for days-off and physician preferences are incorporated in the formulation. Cohn et al. [15] schedule the teaching phase of special training programs at Boston University School of Medicine. The output consists of 1-year schedules with on-call service and vacation considerations.

White and White [36] assign hospital rounds by specialty for teams containing senior, junior physicians as well as residents. They implement a metaheuristics in combination with a logic constraint algorithm to obtain monthly schedules. Ovchinnikov and Milner [25] apply spreadsheet modeling for the scheduling of residents over a 1-year horizon in radiology at the University of Vermont’s College of Medicine. In their opinion, spreadsheet models are easy to implement for small size problems and are accepted by the users. The investigation of several scheduling procedures for emergency physicians currently in use at six different hospitals is given in [12].

Implicit shift scheduling approaches are also studied in the literature of personnel scheduling. For instance, some authors tackle a single-day problem with linear programming techniques [23, 32]. Cezik et al. [13] implement these ideas within a days-off scheduling framework to generate weekly tours. Implicit modeling techniques for including lunch breaks have also gathered some attention [3].

Building upon these ideas, Brunner et al. [9] introduce an implicit formulation for the flexible shift scheduling problem of physicians in hospitals. The method generates shifts during the course of the solution procedure. To solve the problem, they use a heuristic decomposition strategy. In a subsequent work, they develop a branch-and-price algorithm with two branching schemes for the problem [10]. They extend the model and include part-time workers. A broad test design using real data as well as random data is used to evaluate the algorithm for up to 6 week planning horizons. Stolletz and Brunner [31] use a different modeling approach to solve the same flexible shift scheduling problem of physicians introduced in [9]. Additionally, the formulation is extended to incorporate fairness.

Rousseau et al. [28] use three approaches, namely constraint programming, local search, and genetic algorithms to solve a physician scheduling problem. Another technique called goal programming is applied in [34]. Monthly tours are generated for emergency medical residents with two different shift types, i.e. a 10-hour day and a 14-hour night shift. The largest instance solved reflects the problem size for 23 residents and a 31-day planning horizon. More insights into the underlying problem as well as a real application at a pulmonary unit of a local hospital are presented in a subsequent work [35]. The problem of allocating night shifts to residents under consideration of preferences and skill requirements is tackled in [29]. Several heuristics are employed to solve the MIP formulation.

Column generation as a solution technique is often used to solve personnel scheduling problems. Bard and Purnomo [2] apply a column generation scheme to schedule nurses with individual preferences. Similarly, Jaumard et al. [20] use an exact branch-and-price algorithm for the midterm nurse scheduling problem. The subproblem is represented as a shortest path problem and is solved using dynamic programming. Final rosters are obtained for departments with a dozen nurses. An integrated nurse scheduling and operating room procedure based on column generation is introduced in [6]. The subproblem that generates new rosters for the nurses is formulated as a shortest path problem. The other subproblem that constructs new patterns for surgeries is a general integer programming formulation. The algorithm provides lower and upper bounds on the number of nurses. In another work, the same authors present two different decomposition schemes when creating the components of a column generation procedure [5]. Various other researches report successful implementations of column generation schemes [4, 10, 27].

The literature review indicates that normally a fixed number of staff is assumed and scheduling flexibility in terms of various shift starting times and different shift lengths is often not considered. To the best of our knowledge, we are not aware of any work that uses flexible shift scheduling for staffing decisions of physicians in hospitals within a column generation framework. In the following, we present a new long-term staffing model with flexible shifts that are modeled implicitly.

3 Problem description and development of mixed-integer programming model

In the German health care system every patient has the right to receive treatment that is in accordance with the standard of a specialist [30]. To ensure this, residents have to be under the control of a specialist who ensures an adequate level of training and provides a backup system for minor problems as well as emergencies during operations. At MRI the number of operations with anesthesiological care has increased from 16,700 operations per year in 2003 to 24,000 operations per year in 2009. The performance report can be requested by clinical center of anesthesia at MRI [24]. Consequently, the number of anesthetists in the department has grown in the same time. However, residents who are at the beginning of their professional training contribute the highest number to the hired staff. Hiring physicians in Germany has become a problem over the last years. Especially, specialists are a scarce resource because many of them decide to work abroad or turn away from medicine in order to work in different occupational areas [26]. The main problem faced by hospital administrators is to define an optimal number of physicians for each experience level that minimizes costs and meets a given demand of operations without violating existing labor law contracts.

We consider a strategic context in that we decide about the composition of the staff with different experience levels that should be hired to cover varying demand. The problem consists of different experience levels eE. Physicians with experience level e work according to certain working characteristics, such as minimum shift length, maximum shift length, minimum rest duration between shifts, working time per week, wage, as specified by general labor rules or individual agreements between the physicians and the hospital. Furthermore, physicians are scheduled over a planning horizon of wW weeks where each week is divided in pP periods, normally stated in 1-hour increments. In our planning model, we consider normal working days Monday through Friday. On the other side, we have multiple time varying demands d ewp for all eE, wW, and pP. Each demand is characterized by experience level e. Consequently, demand in any period can only be covered by a physician iI e who has at least the required or a higher experience level. We assume that the labor market provides enough staff for the hospital which is a critical assumption as mentioned above. Our goal is to find the best composition of staff with individual experience levels to cover demand subject to labor costs.

To formulate the strategic model we need some sets of decision variables. The first binary variable signals if physician i with experience e is hired in week w and is identified by y eiw . Similar to this decision variable, we need one that decides if physician i with experience e is on duty in period p and week w. We model the decision by a binary variable that is denoted with x eiwp , ∀ eE, iI e , wW, pP. Furthermore, to enforce minimum and maximum shift lengths as well as maximum rest lengths we need one more variable that characterizes a shift start. The variable is denoted by \( y_{{eiwp}}^{{shift}} \) and is set to 1 if a shift starts in period p and week w. From a modeling perspective the variable \( y_{{eiwp}}^{{shift}} \) indicates a transfer from a 0 to a 1 in the x-variables. To state the strategic MIP model we use the following notation in our developments.

Sets with indices

E :

set of experience levels (index e)

W :

set of weeks in the planning horizon (index w)

P :

set of periods in a planning week (index p)

I e :

set of physicians with experience level e that can be hired (index e)

Parameters

d ewp :

demand in week w and period p that requires at least experience level e

r e :

regular working hours per week for physicians with experience e

C e :

fixed cost for hiring physicians with experience e in any week

\( \bar{P}_e^{{shift}} \) :

maximum shift length for physicians with experience e

\( \underline P_e^{{shift}} \) :

minimum shift length for physicians with experience e

\( \underline P_e^{{rest}} \) :

minimum rest length after a shift ends for physicians with experience e

Binary decision variables

x eiwp :

1 if physician i with experience e is on duty in week w and period p, 0 otherwise

\( y_{{eiwp}}^{{shift}} \) :

1 if physician i with experience e begins a shift in week w and period p, 0 otherwise

y eiw :

1 if physician i with experience e is hired in planning week w, 0 otherwise

$$ {\text{Minimize}}\,\sum\limits_{{w \in W}} {\sum\limits_{{e \in E}} {\sum\limits_{{i \in {I_e}}} {{c_e}{y_{{eiw}}}} } } $$
(1a)

subject to

$$ \begin{array}{*{20}{c}} {y_{{eiwp}}^{{shift}} = {x_{{eiwp}}}\left( {1 - {x_{{eiwp - 1}}}} \right)} & {\forall \,e\, \in \,E,\,i\, \in \,{I_e},\,w \in \,W,\,p \in \,P} \\ \end{array} $$
(1b)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p}}^{{p + \underline P_e^{{shift}} - 1}} {{x_{{eiw\rho }}} \geqslant \underline P_e^{{shift}}y_{{eiwp}}^{{shift}}\,} } & {\forall e\, \in \,E,\,i\, \in \,{I_e},\,w \in W,p \in \left\{ {1, \ldots, \left| P \right| - \underline P_e^{{shift}}} \right\}} \\ \end{array} $$
(1c)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p}}^{{\left| P \right|}} {{x_{{eiw\rho }}}} \geqslant \left( {\left| P \right|-{ }p{ } + { 1}} \right)y_{{eiwp}}^{{shift}}} & {\forall \,e\, \in \,E,\,i\, \in \,{I_e},w \in W,\,p \in \left\{ {\left| P \right| - \underline P_e^{{shift}} + 1, \ldots \left| P \right|} \right\}} \\ \end{array} $$
(1d)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p + \underline P_e^{{shift}}}}^{{p + \overline P_e^{{shift}}}} {\left( {1 - {x_{{eiw\rho }}}} \right) \geqslant y_{{eiwp}}^{{shift}}} } & {\forall \,e\, \in \,E,\,i\, \in \,{I_e},w\, \in \,W,p\, \in \,\left\{ {1, \ldots, \left| P \right| - \overline P_e^{{shift}}} \right\}} \\ \end{array} $$
(1e)
$$ \begin{gathered} \hfill \\ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = 1}}^{{p - 1}} {\left( {1 - {x_{{eiw\rho }}}} \right)} \geqslant \left( {p - {1}} \right)y_{{eiwp}}^{{shift}}} & {\forall \,e\, \in \,E,\,i\, \in \,{I_e},w\, \in \,W,p\, \in \left\{ {1, \ldots, \underline P_e^{{rest}}} \right\}} \\ \end{array} \hfill \\ \end{gathered} $$
(1f)
$$ \begin{gathered} \hfill \\ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p - \underline P_e^{{rest}}}}^{{p - 1}} {\left( {1 - {x_{{eiw\rho }}}} \right)} \geqslant \underline P_e^{{rest}}y_{{eiwp}}^{{shift}}} & {\forall \,e\, \in \,E,\,i\, \in \,{I_e},w\, \in \,W,p\, \in \left\{ {\underline P_e^{{rest}} + 1, \ldots, \left| P \right|} \right\}} \\ \end{array} \hfill \\ \end{gathered} $$
(1g)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{p \in P}} {{x_{{eiwp}}}} \leqslant {r_e}{y_{{eiw}}}} & {\forall \,e\, \in \,E,\,i\, \in \,{I_e},w\, \in \,W} \\ \end{array} $$
(1h)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\varepsilon = 1}}^e {\sum\limits_{{i \in {I_{\varepsilon }}}} {{x_{{\varepsilon iwp}}}} } \geqslant \sum\limits_{{\varepsilon = 1}}^e {{d_{{\varepsilon wp}}}} } & {\forall \,e\, \in \,E,\,w\, \in \,W,p\, \in \,P} \\ \end{array} $$
(1i)
$$ \begin{array}{*{20}{c}} {{x_{{eiwp}}},{y_{{eiw}}},y_{{eiwp}}^{{shift}} \in \left\{ {0,1} \right\}} & {\forall \,e\, \in \,E,\,i\, \in \,{I_j},w\, \in \,W} \\ \end{array}, p \in P \cup \left\{ 0 \right\} $$
(1j)

The objective function (1a) minimizes the cost of hiring physicians with different experience levels that are needed to provide demand coverage. The goal is to find the right composition of staff with different experience levels. Hence, the cost coefficients must assure the logic that physicians with higher experience cost more than physicians with lower experience, i.e. formally spoken \( \forall {e_1},{e_2} \in E:{e_1} \leqslant {e_2} \leftrightarrow {c_{{{e_2}}}} < {c_{{{e_1}}}} \). Note that experience levels are numbered with the highest starting by 1 and then incremented with the next lower level.

Constraints (1b) link the x-variables with the y-variables. In detail, constraints (1b) determine a shift start, i.e. if there is a transfer from a 0 to a 1 in the x-variables then the corresponding y-variable is forced to 1. A linear form of constraints (1b) is given in the Appendix. Initial conditions for x eiw0 have to be provided.

The next block of constraints (1c) – (1g) implicitly defines the shifts according to the labor agreements for physicians with experience e. Constraints (1c) and (1d) force a minimum shift length whereas constraints (1e) assure a maximum shift length. In other words, if a shift starts in period p, i.e. \( y_{{eiwp}}^{{shift}} = {1} \), then constraints (1c) force that all x-variables for periods p up to \( p + \underline P_e^{{shift}} - 1 \) are set to 1 whereas constraints (1e) ensure that at least one of the x-variables for periods \( p + \underline P_e^{{shift}} \) to \( p + \bar{P}_e^{{shift}} \) is set to 0. For instance, consider a physician i with experience level e in week w. Furthermore, assume that a shift starts in period 5, i.e. \( y_{{eiw5}}^{{shift}} = {1} \), and the minimum (maximum) shift length is 6 (10) periods, i.e. \( \underline P_e^{{shift}} = {6} \) and \( \bar{P}_e^{{shift}} = {1}0 \), respectively. Constraints (1c) force that all x-variables for periods 5 to 10 are set to 1, i.e. \( {x_{{eiw}}}_{{5}} = {x_{{eiw}}}_{{6}} = {x_{{eiw}}}_{{7}} = {x_{{eiw}}}_{{8}} = {x_{{eiw}}}_{{9}} = {x_{{eiw}}}_{{{1}0}} = {1} \). In contrast, constraints (1e) ensure that at least one of the variables x eiw11 = x eiw12 = x eiw13 = x eiw14 = x eiw15 has to be 0 which corresponds to a shift end. Note that constraints (1d) are necessary to account for the end of the planning horizon. A minimum rest period before a shift starts is enforced by constraints (1f) and (1g) where constraints (1f) handle the beginning of the planning horizon. Note, to enforce a minimum rest time before a shift starts is equivalent to a constraint that enforces a minimum rest period after a shift ends. However, our modeling does not need an additional decision variable that indicates the end of a shift.

Constraints (1h) set the indicator variables y eiw to 1 in case physician i with experience e is on duty in planning week w. In other words, when one of the on duty variables x eiwp is set to 1 then the associated physician is hired by the hospital in the corresponding week. The demand constraints (1i) ensure that the demand is covered with appropriate staff. Recall, those physicians with higher experience levels can cover demand with lower experience levels but not vice versa. Variable definitions are given in (1j). Remark, the conditions on \( y_{{eiwp}}^{{shift}} \) can be relaxed since the variables will be integral in any final solution. This is a direct result of constraints (1b) and the fact that the x-variables are binary.

4 Solution procedure

As can be seen easily, model (1) decomposes by week. Hence, the decomposition determines the total number of staff for each week in the planning horizon separately, which is 1 year in our case. Note that the results obtained by solving any week have no impact on adjacent weeks. Recall that on weekends there is no demand. To solve the weekly problems, we use a column generation based heuristic to obtain integer solutions [16]. Normally, column generation is used to find the optimum for a linear program (LP) or to solve the LP-relaxations in a search tree to find the optimum for an integer program (IP). The latter solution methodology is common, which is known as branch-and-price [39]. We use the column generation procedure since CPLEX was not able to solve the weekly subproblems to optimality within several hours and the lower bounds provided by CPLEX were really bad. A well known result from literature is that column generation provides a lower bound that is at least as good as the lower bound from the LP-relaxation. In many cases when the subproblems are general IPs the lower bound is often better than the lower bound from the LP-relaxation. This result also holds for our work. Column generation is described as follows. The general IP is decomposed into a master problem (MP) that provides a coordination structure and one or more subproblems (SPs) that are used as column generators. The methodology iteratively adds new columns to MP. The search for new columns is guided by the dual solution of MP in a separated optimization task via the subproblem(s). Compared to the simplex, this method is called pricing out [38]. In other words, to prove LP optimality one has to show that there exists no non-basis variable (column) with negative reduced cost in a minimization context – as in our case. In contrast to the simplex that evaluates all non-basis variables separately to choose the next entering variable, column generation searches the next entering variable via a separated optimization problem that defines the solution space of all non-basis columns. Consequently, column generation considers only a subset of feasible columns at a time since the number of feasible schedules is extremely large. Sometimes, the subproblems have a special structure that allows finding the best promising column efficiently – the column with the most negative reduced cost. However, the subproblems might be general IPs which are hard to solve – as in our case. But in this case, it is well known that the lower bound (LB) should be at least as tight as the LB given by the LP-relaxation of the original IP [39]. When column generation terminates and the solution is not integer then branching is necessary to find the optimal IP solution. Using some heuristics it is possible to transform the continuous solution into an integer solution. But then it is not guaranteed to find the optimal IP solution. In our computations, we found out that it is sufficient to solve MP as an IP when column generation terminates to find satisfactory solutions for the underlying problem. As we will see, in most cases we could prove optimality since the LB provided by column generation is equal or close to the upper bound (UB) provided by solving MP as an IP. A general flow chart of our column generation based heuristic is shown in Fig. 1.

Fig. 1
figure 1

Flow chart of the column generation heuristic

In the following, we state the master problem and derive the subproblem(s).

To create the master problem we separate the demand constraints (1i) from the original formulation (1). We use a set-covering formulation. The remaining constraints define the solution space of a feasible column and are used to state the subproblem(s). As previously mentioned, MP does not consist of all possible columns so it might be possible that MP is infeasible in the course of the algorithm. Assume the case we start MP with no columns. Therefore, we introduce auxiliary variables \( x_{{ep}}^{{out}} \) that denote the number of outside resources hired in period p for experience level e. To state MP, additional notation is used. Note that we drop the index w due to the weekly decomposition. Additionally, the index i can be omitted since physicians with the same experience level are summarized in one subproblem. This eliminates symmetry in the solution space [27].

Set with index

S(e):

set of schedules associated with experience e physicians (index s)

Parameters

c out :

cost per hour for outside resources

\( X_{{ep}}^s \) :

1 if schedule s for experience level e covers demand in period p, 0 otherwise

Integer decision variables

λ es :

number of physicians with experience e working schedule s

\( x_{{ep}}^{{out}} \) :

number of outside physician periods to cover demand with experience e in period p

Master Problem (MP)

$$ {\text{Minimize}}\,\sum\limits_{{e \in E}} {\sum\limits_{{s \in S(e)}} {{{\rm c}_e}{\lambda_{{es}}}} } + \sum\limits_{{e \in E}} {\sum\limits_{{p \in P}} {{c^{{out}}}x_{{ep}}^{{out}}} } $$
(2a)

subject to

$$ \begin{array}{*{20}{c}} {\sum\limits_{{\varepsilon = 1}}^e {\sum\limits_{{s \in S\left( \varepsilon \right)}} {X_{{\varepsilon p}}^s{\lambda_{{\varepsilon s}}}} } + x_{{ep}}^{{out}} \geqslant \sum\limits_{{\varepsilon = 1}}^e {{d_{{\varepsilon p}}}} } & {\forall \,e\, \in \,E,\,p\, \in \,P} \\ \end{array} $$
(2b)
$$ \begin{array}{*{20}{c}} {{\lambda_{{es}}},x_{{ep}}^{{out}} \geqslant 0\,{\text{and}}\,{\text{integer}}} & {\forall \,e\, \in \,E,\,s\, \in \,S(e)} \\ \end{array}, p\, \in P $$
(2c)

The objective function (2a) minimizes the staffing cost of hiring physicians for each experience level and the cost incurred when there are gaps in coverage. Again, the option to assign outside physicians is a modeling device that assures feasibility in the course of the column generation procedure. Therefore, we choose a high cost value for this option, i.e. c e c outeE.

Constraints (2b) ensure that the total number of physicians with experience e in each demand period p is sufficient to cover demand that needs experience level e or higher. The general integer variable λ es denotes the number of physicians with experience e that work according to schedule sS(e). Again, the variable \( x_{{ep}}^{{out}} \) guarantees feasibility. Variable definitions are given in (2c). The integrality conditions on \( x_{{ep}}^{{out}} \) can be relaxed because they will always be integer when variables λ es are integer.

As previously mentioned, in each iteration we look for a negative reduced cost column. To evaluate the reduced cost of a (new/promising) column, we need the dual information corresponding to an optimal solution of the relaxed MP. Note, all columns contained in MP at this stage have non-negative reduced cost, i.e. dual feasibility of MP is achieved. Let δ ep ≥ 0 be the dual value of the demand constraints (2b) in period p for experience level e. Then the reduced cost of each column in MP can be defined as follows.

$$ {\bar{c}_{{es}}} = {c_e} - \sum\limits_{{p \in P}} {\left( {\sum\limits_{{\varepsilon = e}}^{{\left| E \right|}} {{\delta_{{\varepsilon p}}}} } \right)X_{{ep}}^s} $$
(3)

In order to verify that no columns exist such that \( {\bar{c}_{{es}}} \) < 0 for all eE, sS(e) which have not been already contained in MP, we minimize the expression \( {\bar{c}_{{es}}} \) over the constraint region for SP associated with experience level e. We drop the index s in (3) and replace all parameters with SP decision variables to derive the objective function (4) of the subproblem(s).

$$ {\bar{c}_e} = {c_e} - \sum\limits_{{p \in P}} {\left( {\sum\limits_{{\varepsilon = e}}^{{\left| E \right|}} {{\delta_{{\varepsilon p}}}} } \right){x_{{ep}}}} $$
(4)

The following additional notation is used in the developments.

Binary decision variables

x ep :

1 if physician with experience e is on duty in period p, 0 otherwise

\( y_{{ep}}^{{shift}} \) :

1 if physician with experience e begins a shift in period p, 0 otherwise

Subproblems (SPs)

$$ {\text{Minimize}}\,\,{c_e} - \sum\limits_{{p \in P}} {\left( {\sum\limits_{{\varepsilon = e}}^{{\left| E \right|}} {{\delta_{{\varepsilon p}}}} } \right){x_{{ep}}}} $$
(4a)

subject to

$$ \begin{gathered} \hfill \\ \begin{array}{*{20}{c}} {y_{{ep}}^{{shift}} = {x_{{ep}}}\left( {1 - {x_{{ep - 1}}}} \right)} & {\forall e \in E,p} \\ \end{array} \in P \hfill \\ \end{gathered} $$
(4b)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p}}^{{p + \underline P_e^{{shift}} - 1}} {{x_{{e\rho }}}} \geqslant \underline P_e^{{shift}}y_{{ep}}^{{shift}}} & {\forall e \in E,p \in \left\{ {1, \ldots, \left| P \right| - \underline P_e^{{shift}}} \right\}} \\ \end{array} $$
(4c)
$$ \begin{gathered} \hfill \\ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p}}^{{\left| P \right|}} {{x_{{e\rho }}}} \geqslant \left( {\left| P \right|-{ }p{ } + { 1}} \right)y_{{ep}}^{{shift}}} & {\forall e \in E,p \in \left\{ {\left| P \right| - \underline P_e^{{shift}} + 1, \ldots, \left| P \right|} \right\}} \\ \end{array} \hfill \\ \end{gathered} $$
(4d)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p + \underline P_e^{{shift}}}}^{{p + \overline P_e^{{shift}}}} {\left( {1 - {x_{{e\rho }}}} \right)} \geqslant y_{{ep}}^{{shift}}} & {\forall e \in E,p \in } \\ \end{array} \left\{ {1, \ldots, \left| P \right| - \bar{P}_e^{{shift}}} \right\} $$
(4e)
$$ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = 1}}^{{p - 1}} {\left( {1 - {x_{{e\rho }}}} \right)} \geqslant \left( {p - {1}} \right)y_{{ep}}^{{shift}}} & {\forall e \in E,p \in } \\ \end{array} \left\{ {1, \ldots, \underline P_e^{{rest}}} \right\} $$
(4f)
$$ \begin{gathered} \hfill \\ \begin{array}{*{20}{c}} {\sum\limits_{{\rho = p - \underline P_e^{{rest}}}}^{{p - 1}} {\left( {1 - {x_{{e\rho }}}} \right)} \geqslant \underline P_e^{{rest}}y_{{ep}}^{{shift}}} & {\forall e \in E,p \in \left\{ {\underline P_e^{{rest}} + 1, \ldots, \left| P \right|} \right\}} \\ \end{array} \hfill \\ \end{gathered} $$
(4g)
$$ \sum\limits_{{p \in P}} {{x_{{ep}}}} \leqslant {r_e} $$
(4h)
$$ \begin{array}{*{20}{c}} {{x_{{ep}}},y_{{ep}}^{{shift}} \in \left\{ {0,1} \right\}} & {\forall e \in E,p \in P \cup \left\{ 0 \right\}} \\ \end{array} $$
(4i)

The objective function (4a) minimizes the reduced cost of a new promising column. Constraints (4b) to (4i) are similar to (1b) to (1h) and (1j). These constraints link the x-variables with the y-variables, enforce a minimum and maximum shift length as well as a minimum rest period before a shift starts, limit the working periods, and give variable definitions.

5 Experimental study

In this section we investigate the performance of the model. Real-world data supplied by MRI from 2005 is used in the computations. We solve the model for each week in 2005 to determine optimal staff composition. The period length is set to one hour. So, we have 120 periods for each weekly subproblem which corresponds to five working days (Monday through Friday) each 24 h long. According to the hospitals practice, we assume two experience levels high (specialists) and low (residents). Both groups work according to the same working agreements that state a minimum shift length of 6 h, a maximum shift length of 12 h, and a minimum rest period of 12 h before (after) a shift starts (ends) as well. Furthermore, neither a specialist nor a resident can be assigned to more than 42 working hours per week. We use relative cost values rather than real cost to simplify the computation. This is permissible because an appropriate scaling gives the real cost and the physicians in each experience level are assumed to be identical. The cost for a specialist is 1.5 times the cost for a resident. Input data for the physicians is given in Table 1.

Table 1 Parameter for physicians

MRI states that at least 25% of total demand has to be covered by specialists because of training purposes. So, we can derive the demand for specialists and residents using the total demand per hour in 2005.

The demand profile is shown in Fig. 2. The x-axis gives the number of physicians required in each period of the week whereas the y-axis denotes the periods in each week – 120 periods in our case. All 52 weekly profiles are put on top of each other. Dark area shows that the demand for physicians in that period is constant over the year. However, lighter area shows varying demand from week to week. Finally, we have to prevent the usage of outside resources in any final solution. To assure that no demand is covered by outside recourses we set the cost to 1,000, i.e. c out = 1,000.

Fig. 2
figure 2

Total demand profile in 2005

Fig. 3
figure 3

Total number of physicians in each week in 2005

All computations are performed on a 3 GHz PC (Intel® Pentium® 4 CPU) with 2 GB RAM running under the Windows XP Professional operating system. The column generation algorithm is coded in ILOG OPL Studio 5.2 and CPLEX 10.2 is used to solve the LP-relaxation of MP and the IP subproblems as well as to solve MP as IP after column generation terminates. The default settings of CPLEX are used. In the column generation procedure we look for the most promising column. In other words, we solve the IP subproblems to optimality. When we transform the final LP solution to an IP solution by solving MP as IP then we solve the IPs to optimality as well.

Results for the 52 weeks in 2005 are reported in Tables 2 and 3. Column 1 gives the week number whereas columns 2 and 3 give the demand hours for each experience level that have to be covered in the corresponding week. The next four columns give lower bounds on the workforce size. Each LB gives the minimum workforce size that has experience level e or higher to cover total demand up to experience level e. The latter is based on the peak demand period. It is assumed that the number of physicians is at least as big as in the period with the highest demand.

Table 2 Computational results for weeks 1–25 (r e  = 42, ∀ eE)
Table 3 Computational results for weeks 26–52 (r e  = 42, ∀ eE)

The former is often called the total working time bound. In other words, the number of available working hours has to be sufficient to cover the demand hours associated with each experience level. The bounds are calculated as follows.

$$ \begin{gathered} \hfill \\ \begin{array}{*{20}{c}} {{\text{LB\_Work}}(e) = \frac{{\sum\limits_{{\varepsilon = 1}}^e {\sum\limits_{{p \in P}} {{d_{{\varepsilon p}}}} } - \sum\limits_{{\varepsilon = 1}}^e {{\text{LB\_Work}}\left( {\varepsilon - 1} \right){r_{{\varepsilon - 1}}}} }}{{{r_e}}} + {\text{LB\_Work}}\left( {e - 1} \right)} & {\forall e \in E} \\ \end{array} \hfill \\ \end{gathered} $$
(5)
$$ {\text{LB\_Max\_Dem}}(e) = \mathop{{{ \max }}}\limits_{{p \in P}} \left( {\sum\limits_{{\varepsilon = 1}}^e {{d_{{\varepsilon p}}}} } \right)\forall e \in E $$
(6)

Columns 8, 9, and 10 give the total time spent as well as the computation time for the subproblems and the time used to transform a continuous solution into an integer solution when solving MP as IP. The times spent on the subproblems account for the main computational burden in the course of the algorithm. This is not surprising since the subproblems are general IPs. However, the time used to generate an integer solution is remarkably small, i.e. in all instances it is around one second. The remaining columns give some solution statistics. For instance, column 11 gives the best LB generated by the column generation procedure whereas column 12 states the best UB achieved by solving the final MP as IP. The development of the LBs during the course of the column generation procedure is given in Fig. 4 in the Appendix. The well known tailing off effect can be seen. The next column provides the absolute gap. As can be seen in 42 out of 52 instances, the algorithm is able to find the optimal solution. In the remaining 10 instances, the final integer solution is close to the optimal solution.

Fig. 4
figure 4

Development of LBs for weeks 1–52 (r e  = 42, ∀ eE)

The next three columns give the number of physicians which consists of the number of specialists plus the number of residents. Columns 17 to 19 give the available supply hours, the finally used supply hours, and the idle time in the schedule. Idle time is defined as the time that is assigned in the schedule, however it is not necessary to cover the demand hours (idle hours = used hours – demand hours). On average 1,571.77 supply hours are available to cover demand. However, the algorithm uses just 1,393.12 h which is 88.63%. This is because the lower bounds on the workforce size are determined by the maximum demand period (cf. columns ‘LB_Max_Dem(Spec)’ and ‘LB_Max_Dem(Res)’). The last two columns report the number of columns generated and the number of iterations necessary to solve the LP-relaxation of MP. The algorithm generates on average 53.3 columns and needs on average 29 iterations to find the LP-relaxation. Recall that we start the algorithm with no columns included in MP.

Figure 3 shows the total number of physicians in each week. The gray part of each bar corresponds to the number of residents whereas the black part gives the number of specialists. As can be seen, the number of specialists is almost stable for all 52 instances. Except for weeks 1 and 52 where less specialists are necessary. An explanation for this situation is the lesser demand in that period of time. On the other side, the number of residents varies form week to week. From a hospital management point of view, the outputs show that on average approximately 10 specialists are necessary throughout the year. Furthermore, the number of assigned residents is on average 27.17 with a minimum of 12 and a maximum of 33.

In total we can conclude that the proposed algorithm works fine when the lower bound is determined by the maximum demand periods for each experience level. In such cases, there exist much more available supply hours than demand hours.

In the next set of experiments, we reduce the working hours per week from 42 to 30. We then apply the algorithm again to weeks 2–6. All other parameters remain unchanged. But we set a time limit of 60 s when we solve MP as IP. The same output statistics are provided in Table 4. As can be seen now, the LBs determined by the working hours (cf. columns ‘LB_Work(Spec)’ and ‘ LB_Work(Res)’) are bigger than the LBs based on the maximum demand periods. The computation time to solve the instances with less working time increases approximately by a factor of 10 compared to time presented in the previous experiments. Again the time spent on the solution of the subproblems accounts for the main computational burden. The lower bounds are fractional solutions instead of integer solutions in the previous experiments. The best LBs (see column 11) can be improved by rounding up to the next integer. We use the improved lower bounds to calculate the absolute gap between the best upper and lower bound (cf. column 13). The algorithm is not able to prove optimality for any instance in the second set of experiments. However, the gaps indicate that high quality solutions are found. To prove optimality some branching is necessary. Furthermore, the number of hired physicians increases which is not surprising. Additionally, the available supply hours are used in a more efficient way. Only instance 4 in Table 4 does not use all available hours, i.e. 1 h is not used. The idle time ranges from 39 to 58 h. Also, this indicates a more efficient usage of resources in the scheduling. Finally, the number of generated columns increases considerable as well as the number of iterations to prove LP-optimality.

Table 4 Computational results for weeks 2–6 (r e  = 30, ∀ eE)

Based on the second set of experiments, we conclude that if the available working hours determine the LBs on the optimal workforce size then the algorithm needs more computation time and generates more columns. However, solution quality is still very high and the algorithm is capable to solve the problems. Such an analysis gives hospital management the opportunity to tare working contracts with its staff. This is most valuable when staffing decisions has to be made or when a general staffing policy is developed. In such cases, the algorithm allows to evaluate different philosophies under consideration by hospital management.

6 Conclusion

In this paper, we have presented a long-term staffing model for the flexible shift scheduling problem of physicians with different experience levels. The model treats shifts implicitly when constructing a roster which allows more flexibility in the scheduling process. Additionally, it guarantees appropriate staffing ratios according to hospital policies. Furthermore, we have introduced a column generation based heuristic to solve the underlying problem. The output consists of staffing levels for physician groups with different experience levels. Using real-world data, computational investigations show the efficiency of the proposed algorithm in finding (near) optimal solutions when two experience levels, i.e. residents and specialists, are considered. Hospital management benefits from the information provided by the model when staffing decisions have to be made. As has been shown, the main burden on computational runtime is spent in solving the subproblems. Further research should be aimed in developing a better representation of the subproblems including customized procedures to solve them more efficiently. Another starting point for research might be the feasibility heuristic which at the moment is not really sophisticated. Rounding procedures coupled with metaheuristics have proven to be the most effective way. Finally, if one is interested in exact solutions then the column generation subroutine should be embedded in a branch-and-bound framework.