1 Introduction

Prior studies on appointment scheduling have given limited attention to doctor interruptions. Interruptions or “gap times” refer to disruptions or delays in the provision of service. In an outpatient facility, interruptions may involve phone calls from other doctors or pharmacists, writing up charts/notes, and dealing with staff. In this study, a delay is not considered an interruption if it relates to the client the doctor is currently scheduled to serve (e.g., if the doctor is consulting the current patient’s chart). Interruptions are typically modeled as non-preemptive, that is, occurring between patient appointments (Cayirli and Veral 2003). The question arises as to whether interruptions have an impact on the design of an optimal appointment schedule (AS). In this paper, a simulation optimization approach is proposed to determine how the “optimal” schedule changes with increasing levels of interruptions and whether interruptions should be accounted for when developing a schedule.

The appointment scheduling problem is exceedingly difficult, from both a practitioner and a research perspective, because of the significant uncertainty in the system (e.g., stochastic arrival and service times, cancellations, and no-shows). Prior research has primarily used analytical methods or simulation to analyze this problem. Analytical methods include queuing theory (Brahimi and Worthington 1991; Pegden and Rosenshine 1990; Stein and Côté 1994; Jansson 1966; Mercer 1960), nonlinear programming (Robinson and Chen 2003), stochastic linear programming (Denton and Gupta 2003), and dynamic programming (Fries and Marathe 1981; Liu and Liu 1998). Analytical methods generally have difficulty capturing the complexity of the system and the diversity in environmental variables, and are often restrictive (e.g., in terms of service time distributions and clinic size) in order to make the models tractable (Cayirli and Veral 2003). Simulation, on the other hand, is able to account for more complex settings but is unable to search for an optimum. Simulation studies have analyzed environmental factors such as patient and doctor unpunctuality (Blanco et al. 1964; Fetter and Thompson 1966; Vissers and Wijngaard 1979; Vissers 1979), patient heterogeneity (Klassen and Rohleder 1996; Rohleder and Klassen 2000; Cayirli et al. 2006, 2008), and scheduling over a time horizon of two or more weeks (Klassen and Rohleder 2004). Ho and Lau (1992, 1999) also use simulation to evaluate various scheduling rules and provide a heuristic for choosing a best rule under specific environmental conditions.

A few studies have considered doctor interruptions. Rising et al. (1973) find that interruptions such as consultations with other physicians, emergency phone calls, and extended breaks result in time lost to the clinic, and that patient waiting times are sensitive to this factor. Interruptions can disrupt the system since a delay in the start of an appointment has the potential to impact the start of all subsequent appointment slots. Lehaney et al. (1999) allow for interim periods or gap times in their model to be completed before the next patient is seen. They show that a schedule which front-loads the system with relatively shorter appointments results in larger queues forming early in the session and tapering off toward the end. This reduces the mean waiting time of patients. The objective of this paper is to determine the degree of sensitivity of AS performance to various interruption levels. Data collected in outpatient clinics is used as a basis for the simulation optimization model.

Simulation optimization is a stochastic optimization method that enables a search for solutions in problems where some or all of the system parameters are stochastic (Fu 2002; Law and Kelton 2000). It is well suited for problems where uncertain parameters can be represented by probability distribution functions. In this paper, simulation is combined with a metaheuristic technique which allows for the exploration of large complex search domains. The combination of simulation and metaheuristics such as scatter search, simulated annealing, or genetic algorithms has been applied in other problem environments (e.g., manufacturing systems design (Azadivar and Tompkins 1999; Teleb and Azadivar 1994; Pierreval and Tautou 1997), single server queuing systems (Fu 2002), inventory control models (Fu 2002; Lopez-Garcia and Posada-Bolivar 1999), and environmental policy planning (Linton et al. 2002; Huang et al. 2005)).

The paper is organized as follows. Section 2 develops the simulation optimization model and experimental design. Section 3 presents the results. Section 4 concludes with a discussion of the managerial implications for appointment system design.

2 Model development

2.1 Simulation optimization algorithm

The problem formulation for simulation optimization algorithms specifies the objective function and constraints as a set of discrete-event simulation models in which a heuristic guides the search for an optimum. This method generates solutions to a problem by iteratively generating sets of decision variable values for evaluation by a simulation module. This method can significantly decrease the time and cost of finding a (near) optimal solution since it performs an intelligent systematic search of the solution space. This precludes the need to exhaustively explore all possible configurations as when simulation is used alone.

The simulation optimization approach used in this study combines simulation with a scatter search heuristic. Arena (Kelton et al. 2007) is used for the simulation, while OptQuest (2007) is used as the heuristic search engine. Scatter search is an evolutionary method that has shown great potential for solving difficult combinatorial and nonlinear optimization problems with integer decision variables (Marti et al. 2006; Glover 1994). Secondary heuristics based on tabu search and neural networks (OptQuest 2007) are also employed to guide the search for an optimum. An overview of the algorithm is given in Table 1.

Table 1 Simulation optimization algorithm

A distinguishing feature of scatter search heuristics is a mapping mechanism that translates infeasible points into feasible ones rather than removing them from consideration. Furthermore, since the search mechanism is population based, it permits the algorithm to simultaneously search many areas of the search domain. Tabu search is used both to ensure diversity in the population and to prevent the algorithm from revisiting solutions that have proven to be sub-optimal.
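The general idea of a heuristic that proposes schedules, maps infeasible proposals back into the feasible region, and passes them to a simulation for evaluation can be illustrated with a much simpler search than the one used in this study. The Python sketch below is plain stochastic hill climbing, not scatter search and not the OptQuest engine; the names `repair`, `local_search`, and `evaluate` are assumptions made for this example, and `evaluate` stands in for whatever simulation module returns the cost of a schedule.

```python
import random
from typing import Callable, List, Sequence, Tuple


def repair(x: Sequence[float], upper: Sequence[int]) -> List[int]:
    """Map a trial point to a feasible one: integer slots with 0 <= x_i <= upper_i."""
    return [min(max(int(round(v)), 0), u) for v, u in zip(x, upper)]


def local_search(
    evaluate: Callable[[List[int]], float],
    start: Sequence[float],
    upper: Sequence[int],
    iterations: int = 500,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Hill climbing over integer slot lengths (a stand-in for the real heuristic)."""
    rng = random.Random(seed)
    best = repair(start, upper)
    best_cost = evaluate(best)
    for _ in range(iterations):
        trial = list(best)
        i = rng.randrange(len(trial))      # pick one slot at random
        trial[i] += rng.choice((-1, 1))    # lengthen or shorten it by 1 min
        trial = repair(trial, upper)       # keep the proposal feasible
        cost = evaluate(trial)             # simulation would be called here
        if cost < best_cost:               # accept only strict improvements
            best, best_cost = trial, cost
    return best, best_cost
```

Unlike a population-based method, this sketch follows a single trajectory and can stall in a local optimum, which is one reason a heuristic such as scatter search is preferred for the full problem.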

Because the search techniques used are probabilistic, the quality of the solutions generated can be highly variable (Andradóttir 2002). Although the scatter search and tabu search heuristics are able to offset this effect by performing an intelligent systematic search of the solution space, the probabilistic search mechanism means that repeated experiments can produce different results. In Klassen and Yoogalingam (2009) the robustness of the simulation optimization approach used in this study was validated by testing its performance against prior analytical studies. The simulation optimization algorithm was able to find solutions that were better for a problem test bed containing stochastic elements. Furthermore, when compared to rules found in prior simulation studies, the algorithm was able to find superior solutions.

2.2 Problem formulation

The appointment system considered in this study consists of a single server where clients are served according to a predetermined schedule. The problem facing the decision maker is one of allocating appointment times to N patients over a finite time horizon \(T_0\), or operating session, such that the combined costs of patient waiting and physician idle time are minimized. The notation used is given in Table 2.

Table 2 Model notation

It is assumed that patients and the doctor arrive punctually (neither late nor early). Given that \(x_1 = t_1\), \(CWT_1 = 0\), and \(IT_1 = 0\), it follows that:

$$t_{i} = t_{{i - 1}} + x_{i} ,\quad i = 2, \ldots ,N$$
(1)
$$CWT_{i} = \max {\left\{ {t_{{i - 1}} + CWT_{{i - 1}} + s_{{i - 1}} - t_{i} ,0} \right\}}{\text{,}}\quad i = 2, \ldots ,N$$
(2)
$$IT_{i} = \max {\left\{ {t_{i} - {\left( {t_{{i - 1}} + CWT_{{i - 1}} + s_{{i - 1}} } \right)},0} \right\}},\quad i = 2, \ldots ,N$$
(3)
$$DET = \max \left\{ {t_N ,t_{N - 1} + CWT_{N - 1} + s_{N - 1} } \right\} + s_N $$
(4)
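As an illustration, the recursion in Eqs. (1)–(4) can be traced for a single realization of service times. The following Python sketch is not part of the original model code; the function name `evaluate_schedule` and its list-based inputs are assumptions made for this example, and idle time is measured from the previous patient's completion time \(t_{i-1} + CWT_{i-1} + s_{i-1}\), the same quantity used in Eq. (2).

```python
from typing import List, Sequence, Tuple


def evaluate_schedule(
    x: Sequence[float], s: Sequence[float]
) -> Tuple[List[float], List[float], float]:
    """Trace waiting, idle, and day-end times for one set of service times.

    x[i] is the interval allotted before appointment i+1 (x[0] equals t_1);
    s[i] is the realized service time of patient i+1.
    Returns (CWT, IT, DET) with CWT_1 = IT_1 = 0.
    """
    n = len(x)
    if n == 0 or len(s) != n:
        raise ValueError("x and s must be non-empty and of equal length")

    # Eq. (1): appointment times are cumulative sums of the intervals.
    t = [x[0]]
    for i in range(1, n):
        t.append(t[i - 1] + x[i])

    cwt = [0.0] * n
    it = [0.0] * n
    for i in range(1, n):
        prev_finish = t[i - 1] + cwt[i - 1] + s[i - 1]
        cwt[i] = max(prev_finish - t[i], 0.0)  # Eq. (2): client waiting time
        it[i] = max(t[i] - prev_finish, 0.0)   # Eq. (3): doctor idle time

    # Eq. (4): the day ends when the last patient has been served.
    if n == 1:
        det = t[0] + s[0]
    else:
        det = max(t[-1], t[-2] + cwt[-2] + s[-2]) + s[-1]
    return cwt, it, det
```

For example, with intervals `[0, 15, 15]` and service times `[20, 10, 12]`, the second patient waits 5 min because the first consultation overruns, the doctor is never idle, and the day ends at 42 min.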

A general formulation for the problem is given by:

$$\min \,WIT = \;c_w E\left( {\sum\limits_{i = 2}^N {CWT_i } } \right) + c_{it} E\left( {\sum\limits_{i = 2}^N {IT_i } } \right)$$
(5)
$${\text{s}}{\text{.t}}{\text{.}}\quad 0 \leqslant x_{i} \leqslant l^{u}_{i} \quad \forall i$$
(6)
$$x_i \,\,{\text{integer}}$$
(7)

The parameter \(l_i^u \) is the upper bound for each appointment slot. Appointment lengths are constrained to be integers to add realism. Data collected in outpatient clinics showed that interruptions during appointments are usually related to some aspect of the current patient’s service and thus are not considered “true” interruptions. Interruptions are therefore assumed to occur only between patient appointments, with probability δ, and their length follows an exponential distribution with a mean of θ. In other words, the doctor will finish with a patient before proceeding to deal with the interruption. It is assumed that all patients will be seen and the physician will complete service for all scheduled patients regardless of interruptions. As in some prior simulation studies (e.g., Klassen and Rohleder 2004; Ho and Lau 1992), the cost parameters \(c_w\) and \(c_{it}\) are set to one for this study.
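The expectation in Eq. (5) has no convenient closed form once interruptions are included, which is why it is estimated by simulation. The Python sketch below is a plain Monte Carlo estimator written for illustration only; it is not the Arena model used in this study. The function name `estimate_wit`, its default values, and the choice to treat an interruption as extra time the doctor is occupied after the preceding patient are all assumptions of this example.

```python
import math
import random
from typing import Sequence


def estimate_wit(
    x: Sequence[float],
    mean_service: float = 15.0,
    cv: float = 0.6,
    delta: float = 0.2,   # probability of an interruption between appointments
    theta: float = 5.0,   # mean interruption length (min), exponential
    c_w: float = 1.0,
    c_it: float = 1.0,
    reps: int = 2000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of the weighted waiting plus idle cost in Eq. (5)."""
    rng = random.Random(seed)
    # Parameters of the underlying normal for a lognormal with this mean and CV.
    sigma2 = math.log(1.0 + cv * cv)
    mu = math.log(mean_service) - sigma2 / 2.0
    sigma = math.sqrt(sigma2)

    total = 0.0
    for _ in range(reps):
        t_prev, wait_prev, cost = x[0], 0.0, 0.0
        for i in range(1, len(x)):
            busy = rng.lognormvariate(mu, sigma)      # service of patient i-1
            if rng.random() < delta:                  # assumed: interruption follows service
                busy += rng.expovariate(1.0 / theta)
            free_at = t_prev + wait_prev + busy
            t_i = t_prev + x[i]
            wait_i = max(free_at - t_i, 0.0)
            idle_i = max(t_i - free_at, 0.0)
            cost += c_w * wait_i + c_it * idle_i
            t_prev, wait_prev = t_i, wait_i
        total += cost
    return total / reps
```

Fixing the seed makes repeated evaluations of the same schedule reproducible, which is convenient when the estimate is passed to a search heuristic.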

2.3 Experimental design

To study the effects of doctor interruptions on appointment system design, data was collected by observation in three outpatient clinics in Hamilton, Ontario, Canada. The observed clinics had one doctor each with varying numbers of appointments scheduled in varying session lengths. Interviews were carried out and time studies were done. The aim was to gain understanding regarding planned appointment times, actual service times, doctor lateness, and doctor interruptions. The data was collected for eight clinic sessions: four mornings and four afternoons. Observations recorded include patient arrival times, appointment times, service start and end times, and interruptions to the service. Interruptions are defined as a break in service that is unrelated to the current patient. The data shows that interruptions between appointments are relatively common, occurring for 22.65% of the appointments. These interruptions can impact the schedule since they delay the start of the next appointment. A summary of the data is provided in Table 3. The purpose of the data is to help establish validity of the input parameters used in the study. Thus, this study does not provide a case analysis of any one clinic; rather, it uses the data to provide a realistic representation of the stochastic parameters of the model.

Table 3 Observed clinic environments

An analysis of the data suggests some interesting issues in regard to how interruptions should be modeled. It is apparent that interruptions can be modeled by “adding” interruptions to the doctor’s regular work, thus increasing the total amount of work. The case where interruptions add work is modeled and the best AS are presented. However, some of the data suggests that this may not be accurate for all doctors. Some of the sessions had fairly high levels of interruptions, but the doctor worked only a minimal amount of overtime. One 210 min session had 24.5 min of interruptions, but the doctor only worked 5 min overtime. A 90 min session had 3 min of interruptions, but the doctor was finished 7 min early. This suggests that some doctors develop their schedules knowing they will be interrupted, and allow for this by scheduling their total patient consultation time to be less than the session length. On the other hand, this may suggest that some doctors adjust their behavior in real time based on the level of interruptions they are receiving. This second case, where interruptions occur but do not add work, is also modeled in order to determine the best AS. In addition, by comparing the two scenarios, insights are gained regarding how important it is to model interruptions.

Lognormal service times are used. Based on Chi-square and Kolmogorov-Smirnov tests (α = 0.05), the lognormal distribution is a good fit for the data. Lognormal times have also been found empirically (e.g., Cayirli et al. 2006, 2008; Klassen and Rohleder 1996; O’Keefe 1985). Service times are distributed as follows:

$$s_i \sim Lognormal\,\left( {\mu _N ,\sigma _N } \right)$$
(8)

The mean (\(\mu_N\)) is the length of the session divided by the number of appointments scheduled per session, N. Outpatient clinics typically report session lengths between 180 min and 240 min (Klassen and Rohleder 1996). The length of the session in this study is set to the midpoint of this range, 210 min. The size of the clinic is set at 14 appointments that are 15 min each on average. The standard deviation is defined as \(\sigma _N \sim Lognormal\left( {\alpha _N ,\beta _N } \right)\) where \(\alpha_N\) and \(\beta_N\) are set to obtain a specific coefficient of variation (CV). The CV captures the variability in service times and Cayirli and Veral (2003) report CV levels that typically fall between 0.35 and 0.85. Three levels (0.35, 0.60, 0.85) are used in this study to determine if this factor combined with interruptions has a major impact on the best AS. Thus, the values used for N = 14 and CV = 0.6 are as follows: \(\mu_N = 15\) min, \(\alpha_N = 7.5\), and \(\beta_N = 7.5\).
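When lognormal service times are sampled outside a simulation package, many libraries expect the parameters of the underlying normal distribution rather than the mean and CV of the service time itself. The Python sketch below shows the standard conversion; the function name `lognormal_params` is an assumption of this example, and the sketch does not attempt to reproduce the \(\alpha_N\) and \(\beta_N\) settings quoted above.

```python
import math
from typing import Tuple


def lognormal_params(mean: float, cv: float) -> Tuple[float, float]:
    """Return (mu, sigma) of the underlying normal for a lognormal variable
    with the given mean and coefficient of variation."""
    if mean <= 0 or cv < 0:
        raise ValueError("mean must be positive and cv non-negative")
    sigma2 = math.log(1.0 + cv * cv)       # variance of the underlying normal
    mu = math.log(mean) - sigma2 / 2.0     # so that exp(mu + sigma2/2) == mean
    return mu, math.sqrt(sigma2)


# Service-time settings from this section: 210 min session, 14 appointments.
mu, sigma = lognormal_params(210.0 / 14, 0.60)
```

The resulting pair can be passed directly to Python's `random.lognormvariate(mu, sigma)` to draw service times with a 15 min mean at the chosen CV.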

Table 3 shows interruptions varying from 8.3% to 50%. Consequently, interruption levels are tested at 0%, 10%, 20%, 30%, 40%, and 50%. The length of each interruption was fitted from the empirical data in Table 3 and follows an exponential distribution with a mean of 5 min.

3 Results and analysis

3.1 Case 1: Interruptions add work

As discussed above, in some clinic settings interruptions result in additional work for the doctor which is an intuitive way to account for interruptions. In this case, the doctor has an average of 210 min of work serving patients and interruptions add more work to this. The best AS with their objective function values are shown for all 18 cases in “Appendix A”. Table 4 provides the percentage change in some relevant measures as the interruption level increases.

Table 4 Performance metrics (CV = 0.60)

Table 4 and “Appendix A” show that costs increase as interruptions rise. For example, for CV = 0.60 the mean total cost of waiting and idle times increases by approximately 9% on average for each step up in the level of interruptions (e.g., from 10% to 20%). Results also show higher costs for larger values of the CV. Thus, performance of the AS decreases as variability in service times increases. A plot of the best AS for two representative interruption levels for CV = 0.60 is given in Fig. 1. The vertical lines show where the appointment slots would fall if fixed-interval, 15 min appointment slots were used.

Fig. 1 Appointment intervals for the best AS: interruptions add work (CV = 0.6)

All AS demonstrate shorter appointment slots at the beginning and end of the session, with fairly constant appointment slot lengths in the middle portion. Thus, clients are bunched at the beginning and end of the session. However, “Appendix A” and Fig. 1 show differences in the AS as the level of interruptions increases. On average, appointment slots are spread out more in the middle portion as interruptions increase. Figure 1 shows a higher middle portion and “Appendix A” has longer slots in the middle portion as well as shorter slots at the very end of the session. Average slot length in the middle portion (appointment slots 3–11) for 0–50% interruptions is 17.37 min, 17.74 min, 18.22 min, 18.30 min, 18.70 min, and 18.96 min, respectively. Also, for many AS, the time between the last two appointments is zero, especially as the level of interruptions increases. This suggests that for these cases a double booking strategy for the last two appointments produces the best AS.

Given that this is a first study using empirical data for the duration of interruptions and that the sample used is relatively small, experiments were run to test how the best AS changes with the duration of interruptions. Experiments were run with the duration of interruptions following an exponential distribution with means of 2.5 min and 7.5 min. Results for the 20% and 40% interruption levels are presented in Table 5.

Table 5 Performance metrics for duration of interruptions (CV = 0.6)

The results show the performance of the AS decreases as the mean interruption duration increases and improves as the duration decreases. This is not unexpected and is consistent across varying levels of interruptions. The pattern of appointments remains the same, with shorter appointment slots at the beginning and end of the session and appointments spread out fairly evenly in the middle of the session.

3.2 Case 2: Interruptions do not add work

To model the second case, interruption frequencies and lengths remained the same as in the first case. In order to capture the case where the doctor is aware that interruptions are likely to occur and either develops a schedule with this in mind or adjusts in real time, service times were decreased such that the total amount of work (210 min) remained the same regardless of level of interruptions. For instance, at the 10% interruption level for N = 14 there will be an average of 1.4 interruptions, each lasting 5 min. This represents 7 min of lost time during the session or an average of half a minute per patient; thus, \(\mu_N = 14.5\) min for this interruption level. A plot of the best AS for two representative levels of interruptions and a comparison of client waiting times for each scenario are given in Figs. 2 and 3.
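The adjustment to the mean service time generalizes directly from the 10% example. The Python sketch below repeats that arithmetic for any interruption level; the function name `adjusted_mean_service` is an assumption of this example, and the expected number of interruptions is taken as N times the interruption probability, as in the worked case above.

```python
def adjusted_mean_service(
    session_length: float = 210.0,  # total work per session (min)
    n: int = 14,                    # appointments per session
    delta: float = 0.10,            # interruption probability per appointment
    theta: float = 5.0,             # mean interruption length (min)
) -> float:
    """Mean service time that keeps total expected work at the session length."""
    expected_lost = n * delta * theta        # e.g. 14 * 0.10 * 5 = 7 min
    return (session_length - expected_lost) / n


# Mean service time at each interruption level tested in this study.
levels = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
means = [adjusted_mean_service(delta=d) for d in levels]
```

Under these settings the mean service time falls by half a minute for every ten percentage points of interruptions, from 15 min with no interruptions to 12.5 min at the 50% level.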

Fig. 2 Appointment intervals for the best AS: interruptions do not add work (CV = 0.60)

Fig. 3 Comparison of client waiting times (CV = 0.6)

Results show that the best AS are very similar across interruption levels. In fact, even the height of the middle portion remains virtually the same regardless of level of interruption, and there is no additional bunching at the end as interruption levels increase. In addition, Fig. 3 shows that waiting times are fairly constant as the interruption level increases when the total amount of work remains the same. This is a positive result (that was initially observed in the empirical data), suggesting that it is possible to manage waiting times despite high levels of interruptions.

3.3 Plateau-dome scheduling rule

One of the goals in appointment scheduling is to design actionable rules that are easy to implement from a practitioner standpoint. A major finding in prior studies has been that a “dome” scheduling rule maximizes performance of the system (Denton and Gupta 2003; Robinson and Chen 2003; Wang 1993, 1997). The dome rule sets appointment intervals such that they are gradually increasing toward the middle of the session then gradually decreasing toward the end of the session. However, an AS where intermediate appointment intervals are similar may be easier to implement. Results in this study demonstrate a flatter version of the dome pattern, with the appointment slots in the middle portion having similar lengths. This result is likely attributable both to the integer constraint and to the longer sessions modeled here. In order to explore this concept further, each experiment was rerun to determine the impact of adding a constraint that forced the middle slots to be equal. The addition of this constraint resulted in less than a 0.5% loss in performance on average. The results for the scenario where interruptions do not add work are shown in Fig. 4. We denote this pattern a “plateau-dome” (Klassen and Yoogalingam 2009).

Fig. 4 Plateau-dome scheduling rule: interruptions do not add work

Figure 4 shows that the height of the plateau is the same for interruption levels from 10% to 50% (18 min slots), reinforcing the findings in Section 3.2.
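Because the plateau-dome rule is defined by a few short slots at each end and a single slot length in between, it is straightforward to generate programmatically. The Python sketch below is illustrative only: the function names `plateau_dome` and `is_plateau_dome` are assumptions, and the head and tail slot lengths in the example are hypothetical placeholders rather than values taken from the results; only the 18 min plateau height comes from Fig. 4.

```python
from typing import List, Sequence


def plateau_dome(
    n: int, head: Sequence[int], plateau: int, tail: Sequence[int]
) -> List[int]:
    """Build n slot lengths: given head slots, a constant middle, given tail slots."""
    middle = n - len(head) - len(tail)
    if middle < 1:
        raise ValueError("head and tail leave no room for a plateau")
    return list(head) + [plateau] * middle + list(tail)


def is_plateau_dome(x: Sequence[int], head_len: int, tail_len: int) -> bool:
    """Check that all slots between the head and the tail share one length."""
    middle = x[head_len:len(x) - tail_len]
    return len(middle) > 0 and len(set(middle)) == 1


# Hypothetical head and tail values; the 18 min plateau follows Fig. 4.
schedule = plateau_dome(14, head=[0, 12], plateau=18, tail=[14, 0])
```

A schedule in this form can be communicated to clinic staff with only a handful of numbers, which is the practical appeal of the rule.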

In revisiting the case where interruptions add work, adding the constraint results in plateaus that become gradually higher as interruptions increase. This is consistent with the finding in Section 3.1 that the average slot length in the middle portion increases accordingly. However, sometimes the length of the plateau is shorter. Appendix A shows results without the additional constraint, but it demonstrates this concept. It shows that as the level of interruptions increases, the plateau becomes relatively shorter and higher and a standard dome pattern becomes more prevalent. For instance, a number of the AS for the 40% and 50% interruption levels follow a pattern of successive slot lengths of 0 min, 13 min, 18 min, 19 min, 20 min, 19 min, 18 min, 16 min, 12 min, and 0 min. Thus, when interruptions add work, a traditional dome pattern may produce better performance for higher levels of interruptions, while a flat plateau-dome pattern may be better for lower levels of interruptions.

4 Conclusion

In this study a simulation optimization approach is used to develop optimal outpatient appointment schedules in the presence of interruptions. Empirical data from multiple outpatient clinics was collected and used to develop realistic settings for the input parameters of the model. The main findings are summarized as follows:

  • Performance decreases as the variability of service times (CV) increases.

  • All AS demonstrate a higher, flatter portion in the middle of the session which we have denoted a “plateau-dome”.

  • Interruptions can be modeled either by adding the time of the interruption to the workload of the doctor or by adjusting the doctor’s schedule such that the interruptions do not add work. This distinction has implications for practice.

  • For the case where interruptions add work:

      ◦ Client waiting times and the day end time increase as interruptions increase. Doctor idle time is reduced.

      ◦ The height of the plateau increases as interruptions increase.

      ◦ A traditional dome pattern may produce better performance for higher levels of interruptions, while a flat, plateau-dome pattern may be better for lower levels of interruptions.

  • For the case where interruptions do not add work:

      ◦ All performance measures and the AS (including the height of the plateau) remain very similar as interruptions increase.

Therefore, if a clinic situation is such that doctors are able to account for interruptions in the schedule or adjust their behavior so that the total amount of work is not increased, there is no need to adjust the AS as the level of interruptions increases (Case 2). If, however, the clinic situation does not allow for interruptions and they add work for the doctor, the AS should be adjusted based on the level of interruption experienced (Case 1).

Future research can account for different cost structures, including the relative cost of doctor idle time and overtime. Other factors, such as doctor and patient unpunctuality, may also have an impact on the choice of the optimal AS.