Keywords

1 Introduction

Transportation systems play a central role in modern societies, accounting for 5 % of GDP and employing around 10 million people in EU [1]. A significant share of governments’ budget is spent in public transports. Local transportation authorities face challenges such as the reduction of pollutant emissions [24], the satisfaction of an increased demand for quality and reliable public transport [5], and the design of networks with improved accessibility and mobility of citizens in urban areas [6].

Bus services have a significant impact on the life of citizens, mostly in areas with high population density [5, 7]. The demand for urban transports, particularly by bus, has increased considerably during the last decades, together with the growth of cities. This created the need for higher capacity and improved access to bus transports [6, 8]. Citizens expect bus transportation systems to satisfy their needs with quality, efficiency, low-cost, minimum travel time, punctuality and availability [9, 10].

The planning of bus transportation systems is a complex problem, normally addressed using optimization tools. This problem may consider the minimization of operating costs, waiting time of passengers, the total route length or inconvenience of delays and anticipations from schedule [11]. It can incorporate the dynamics of several features, such as the coordination of buses, the interaction of buses with passengers, constraints of traffic or unforeseen events causing instability [12]. Moreover, the planning of bus systems must consider the specificities of the geographic area to respond adequately to the specific needs, constraints, demand and topology of each case study [5, 13, 14]. A transportation system carefully planned leads to lower costs and increased clients’ satisfaction.

Bus schedules are normally built with optimization methods aiming to minimize cost, originating rigid schedules with few slack time [15]. However, the daily operations of bus systems are highly exposed to uncertainty, arriving from non-foreseen events such as traffic, accidents, weather conditions, road block, road work, strikes, accidents, bus breakdown and unexpected peak of demand [6, 1416]. Delays and disruptions impact negatively the quality of the transportation service, and often trigger expensive recovery actions [15, 16]. Non-foreseen events lead to the occurrence and propagation of delays, and sometimes to the disruption of the service. Delays can be classified into two categories: primary and secondary delays. A primary delay cannot be prevented and induces the later arrival of a service trip that has departed on time [16]. A secondary delay occurs when an already delayed bus trip leads to the delayed start of the next bus trip. This inheritance of delays is called delay propagation. Delay propagation results from dependencies of consecutive trips whose slack time is not sufficient to absorb preceding delays [15, 16]. Only secondary delays can be minimized by adjusting the bus schedule [17, 18].

The management of uncertainty in bus systems can be strengthened with techniques of robust planning (planning phase) and of dynamic re-planning (operational phase) [17]. This paper addresses the robust planning of bus schedules. The most common approach to address this issue is the incorporation of slack time in schedules to absorb delays. Although the allocation of slack time is an approach that can successfully improve the robustness of timetables, its implementation entails the increase of operational costs [16].

This work aims to prevent the occurrence of disruptions due to uncertainty by improving the robustness of bus schedules. A set of real observations is used to define the range in which uncertainty fluctuates, which is a standard approach in robust optimization [19]. The optimization model minimizes deviations of real data from the current schedule. The decision variables correspond to the set of slack time to be added to the mean travel time between two adjacent time control points. The restrictions impose that the total slack of the original schedule and the optimized schedule is the same, implying that the two schedules have identical operational costs.

The robustness of bus schedules is assessed with the quantification of the Average Delay (AD) and the Probability of Delay (PD) per Time Control Point (TCP). Analogously, the Average Anticipation (AA) and the Probability of Anticipation (PA) are also computed. These measures are compared between the optimized schedule and the current schedule, to evaluate the relative performance of the schedule proposed by the optimization model developed in this paper.

The model was applied to a real case study concerning a bus line operating in Porto, Portugal. The bus line under study operates with five coordinated shifts and with non-even headways. The dataset for this case study was obtained through GPS devices installed in all buses. The model was implemented using the CPLEX solver.

A comprehensive literature review on this topic is provided in Sect. 2. The optimization model is described in Sect. 3. The robustness indicators used to assess the schedules obtained with the model are detailed in Sect. 4. The case study is described in Sect. 5. The results obtained and their implications are presented in Sect. 6, and the main conclusions are drawn in Sect. 7.

2 Optimization of Robustness in Bus Transportation Systems

One of the main challenges faced by bus transportation systems is to manage uncertainty occurring at the operational stage, and to minimize its negative impacts (e.g., failure in complying with scheduled times, bunching of buses or the disruption of the service) [20]. An adequate approach to address this problem is the improvement of robustness in bus schedules at the planning phase. The objective is to develop schedules that will minimize negative impacts after the occurrence of uncertainty events at the operational phase. In this context, more robust schedules have an enhanced ability to absorb or minimize the propagation of deviations (delays and anticipations) from the timetable. The implementation of robust schedules improves the quality of the transportation service delivered to passengers, leading to a more reliable service.

However, the definition of robustness of transportation schedules is not consensual in literature [15, 21, 22]. There is a panoply of approaches designed to address the needs of specific case studies, leading to a diversity of robustness definitions and measurements. Whilst studies with applications to bus systems are still scarce [1517, 23], the literature devoted to railway systems is more advanced.

The robustness of a bus schedule can be defined as its ability to absorb secondary delays, considering penalties every time a Primary Delay causes the later start of the next bus trip (i.e., when the slack time is not able to compensate a delay). Naumann, Suhl and Kramkowski [16] applied this concept with the simulation of disruption scenarios designed with stochastic programming, and pursuing the minimization of total costs (i.e., planned costs plus the costs caused by disruptions). The study concluded that robust scheduling significantly reduces the total expected costs (plan and disruption costs) when compared with the sole minimization of planned costs or with the simple addition of fixed slack time between trips. This approach can be unfeasible in large scale problems.

Kramkowski, Kliewer and Meier [17] considers that bus schedules are robust when they are able to reduce the propagation of the effects of unforeseen events. [17] pursued this concept with a measure for delay-tolerance in bus schedules (including planned costs and costs of delays), to obtain schedules with optimized distribution of buffer times, improving the absorption of disruptions and preventing delay propagation. Similarly, [15] applied this concept for the integrated Vehicle Schedule Problem and Crew Schedule Problem, building a bus schedule with increased delay-tolerance by redistributing its buffer time and minimizing its costs.

A related concept considers that schedules are more robust when they return minimum time deviations from planned. This approach was explored by Yan et al. [23] with an optimization model returning the distribution of slack time that minimizes time deviations, including delays, anticipations and overall variability from schedule. This approach was illustrated using data obtained from a Monte Carlo simulation. Hora, Dias and Camanho [24] proposed an optimization model adapted from [23], which optimizes the slack time distribution in order to minimize both delays and anticipations from schedule, using data from historic records.

An associated concept is the “recoverable robustness”, when bus schedules are recoverable by limited means in all likely scenarios, proposed by Liebchen et al. [19].

Finally, it is important to clarify the differences between alternative robust optimization techniques, all of which can be used to address problems whose information is uncertain to some extent. This is the case of bus daily operations that are often unable to comply with the schedule planned. Strictly Robust Optimization returns the best feasible solution for any realization of uncertainty within a convex set [25, 26]. The formulation of strictly robust concepts for public transportation timetables is provided in [27]. A Strictly Robust Optimization model is deterministic, set-based and solved through traditional optimization concepts and algorithms [28]. Moreover, uncertainty is not considered to follow a specific probabilistic distribution, from where Strictly Robust Optimization diverges from Stochastic Optimization [28]. However, strictly robust solutions are often considered too conservative, as they imply a great loss of optimality in order to guarantee robustness [27]. As an alternative to Strictly Robust Optimization, Fischetti and Monaci [29] proposed the approach of Light Robustness Optimization, whose solutions comply with an established quality level whilst maximizing robustness [27]. Models of Light Robustness are solved with heuristic methods, which require the initial specification of parameters such as “robustness goal” and “maximum objective function deterioration” to be accepted in the model [29].

3 Optimization Model to Improve Robustness of Bus Schedules

This section presents a model aiming to improve the robustness of an existing bus schedule. This model is an adapted formulation of the model proposed in [23]. The model can be applied to a bus line integrating several bus stops, some of which are monitored by the bus operating company, called Time Control Points (TCPs). The optimization model considers the entire set of sequential TCPs to be covered during a daily bus shift. A bus schedule is optimized by minimizing the deviation of a set of real observations (k = 1,…,K) from the timetable planned. The decision variables of the optimization model shown in (1) are the values of slack time (stored in vector τ) to be allocated between two adjacent TCPs (TCP m and TCP m+1) in the schedule under construction. This model can be converted by standard techniques into an equivalent Mixed Integer Linear Programming (MILP) model [30].

The objective function quantifies, for all TCPs, the time deviations of a set of real observations from a given schedule under assessment. The objective function penalizes earlier arrivals at each TCP with weight γ 1 and later arrivals with weight γ 2.

This model has several features that differentiate it from the approach proposed by Yan, et al. [31]. It involves the optimization of non-cyclic bus schedules adopting a daily perspective. The component of the objective function associated with the minimization of the absolute value of deviations, included in [31], was removed from our formulation as the quantification of deviation variability was considered to have a high degree of redundancy with the other two components of the objective function. This term would also increase considerably the computation complexity of the optimization model (changing from quadratic complexity to cubic complexity). The total time of the daily trips (Ψ) is not allowed to change, in order to ensure that the original schedule and the new one returned by the optimization model have the same operational cost. Furthermore, it was imposed a minimum travel time between adjacent TCPs (N m,m+1), to ensure the feasibility of the schedule proposed.

Minimize:

$$ {\text{F(}}\tau ,\gamma_{1} ,\gamma_{2} ) { = }\frac{1}{K}\sum\nolimits_{m = 1}^{M} {\sum\nolimits_{k = 1}^{K} {\gamma_{1} \cdot max(0,SD_{m,k} ) + \gamma_{2} \cdot max( - SD_{m,k} ,0)} } $$
(1)

Subject to:

$$ SD_{1,k} = 0 $$
(1.1)
$$ SD_{m + 1,k} = E(T_{m,m + 1} ) + \tau_{m,m + 1} - T_{(m,m + 1),k} + (1 - \beta_{m,m + 1} )*SD_{m,k} $$
(1.2)
$$ E(T_{m,m + 1} ) + \tau_{m,m + 1} \ge N_{m,m + 1} $$
(1.3)
$$ \sum\nolimits_{m = 1}^{M - 1} {(E(T_{m,m + 1} ) + \tau_{m,m + 1} )} =\Psi $$
(1.4)

Where:

  • M∈ℤ: number of TCPs;

  • K∈ℤ: number of real observations of the system (i.e., days);

  • γ 1∈ℝ: weight for earlier arrivals, γ 1∈[0,1];

  • γ 2∈ℝ: weight for later arrivals, γ 2∈[0,1];

  • Ψ∈ℝ: total time for the daily trip (in time units);

  • τ = [τ m,m+1]∈ℝM − 1: vector of slack time to be allocated in each segment between two consecutive TCPs [m,m + 1] (decision variables - in time units);

  • β = [β m,m+1]∈ℝM − 1: vector of dimension (M − 1) storing the Adjustment Factor of Driver between each pair of adjacent TCPs (input data, percentage);

  • E ( T ) = [E(T m-1,m )]∈ℝM − 1: vector of dimension (M − 1) storing the expected value of Travel Time in each segment (in time units);

  • N = [N m,m+1]∈ℝM − 1: vector of dimension (M − 1) storing the minimum travel time needed by a vehicle to travel between two adjacent TCPs (input data, in time units);

  • T = [T (m,m+1),k ]∈ℝ(M − 1) × K: array of dimension (M − 1) × K storing the Travel Time in each segment [m,m + 1] at each day k (input data, in time units);

  • SD = [SD m,k ]∈ℝM × K: array of dimension M × K containing in each position the Schedule Deviation at TCP m in day k (in time units); SD m,k  < 0 indicates a delay, SD m,k  > 0 indicates an anticipation and SD m,k  = 0 indicates full compliance with the schedule.

Constraint (1.1) considers that the system starts on time at the first TCP, for all observations (SD 1,k  = 0). Constraint (1.2) defines each value of Schedule Deviation (SD m,k ) as the accumulated deviation from schedule in observation k at TCP m (considering all preceding TCPs). This constraint quantifies delays and anticipations in each TCP for each observation with respect to the planned schedule and accounts for deviation recovery through the behavior of drivers. When deviations from schedule occur, the model considers that drivers will adapt the average velocity (always complying with regulation) as an attempt to recover. This is included in the model through vector β, which stores a recovery percentage for each segment composing the daily path. A value of β m,m+1=0 means the driver cannot recover from deviations during the course between TCP m and TCP m+1, and a value of β m,m+1=1 means the driver would recover completely from the deviation during that segment. SD m,k is defined as the difference between the expected time (E(T) m,m+1 + τ m,m+1) and the real time (T (m,m+1),k ), plus the accumulated SD from the previous moment (SD m,k ) multiplied by the percentage of recovery from drivers (1 − β m,m+1 ). Values of SD m,k  < 0 indicate a delay from schedule, SD m,k  > 0 indicate that the bus is ahead from schedule, and SD m,k  = 0 indicate that the bus is on time.

Values composing the vector E ( T ) correspond to the current schedule. The values composing array T concern the real observations, which include the arrival time retrieved at each TCP in each day. The vector τ stores the slack time assigned to each TCP which optimizes the problem (decision variables).

Constraint (1.3) ensures that the arrival time for each TCP is always greater or equal to the arrival time of its antecedent TCP plus the minimum travel time necessary to travel that segment (N m,m+1). Note that the value of N m,m+1 is strictly positive. The slack time assigned to a specified TCP can assume a positive or negative value, provided that its joint duration with the correspondent expected time (E(T) m,m+1 + τ m,m+1 ) comply with the minimum time necessary to travel between two adjacent TCPs (N m,m+1).

Constraint (1.4) allows the user to define the total schedule time through the definition of parameter Ψ. In our empirical study this parameter was set to be equal to the total travel time of the daily trips in the original schedule.

4 Measuring Robustness of Bus Schedules

This section details the performance measures proposed in this study to assess the robustness of bus schedules. The indicators AD (Average Delay) and AA (Average Anticipation) represent the absolute values of delays and anticipations per TCP, respectively. The indicators PD and PA represent the proportion of delays and the proportion of anticipations per TCP, respectively. These indicators are also compared between the optimized schedule and the original schedule, to obtain relative performance measures, indicating the improvement attained through the optimization procedure. These other measures were called AD r , AA r , PD r and PA r , where the subscript r intends to highlight their relative nature, involving a comparison between the new schedule with the base schedule. The computations involved in the estimation of these measures are detailed in Table 1.

Table 1. Measures used to assess robustness of schedules.

Where:

  • M: number of TCPs;

  • K: number of days in the sample (i.e., observations);

  • SD: Schedule Deviation at TCP m under day k (in time units); SD m,k  < 0 indicates a delay, SD m,k  > 0 indicates an anticipation (see Eqs. (1.1) and (1.2));

  • ε: tolerance threshold defining the amount of time from which a delay or anticipation should be accounted (e.g., ε = 1 min), ε ≥ 0;

  • D: number of delays observed given a set of real observations, considering the threshold ε;

  • A: number of anticipations observed given a set of real observations, considering the threshold ε;

  • R: indicates a Robust schedule obtained with the model;

  • N: indicates the Nominal schedule currently in use by the transportation operator.

The measure AD, detailed in expression (2), returns the average time that the transportation service was delayed at each TCP. Analogously, measure AA, which is detailed in expression (6), returns the average duration of the anticipations per TCP.

The measure PD returns the proportion of delays observed in TCPs of the bus line during the period studied, which is computed according to expression (4). The transportation service is considered delayed when a bus arrives at a TCP after what was planned in the schedule and the amount of time delayed exceeds a tolerance threshold (ε) specified by the analyst or decision maker. This measure represents the probability that a client arriving on time at a TCP of the bus line will have to wait for a bus more than the time threshold specified. Similarly, the measure PA represents the proportion of anticipated arrivals in TCPs of the bus line. According to expression (8), a bus service is considered anticipated when its arrival at the TCP occurs before what was planned in the timetable, with a deviation from schedule larger than the tolerance threshold (ε).

Relative measures AD r , PD r , AA r , and PD r are adequate to assess the robustness of a schedule when compared with another schedule used as reference. In this study, the indicators associated with the schedule obtained using the optimization model (identified with an R, representing a Robust schedule) are compared with the schedule currently in use by the bus operator (identified with an N, representing the Nominal schedule). When these measures return values lower than one, it means that the new schedule performs better than the reference schedule.

5 Case Study

The case study used in the empirical part of this paper relates to a bus route operating in the city of Porto (Portugal). This section briefly presents this route, describing its geographic path and the shifts operating during a day. This section also describes the dataset used.

Figure 1 shows the map with the path travelled by this route, including the six TCPs selected by the bus company. The six TCPs of this route are: TCP1Campanhã”, TCP2Campo 24 de Agosto”, TCP3Marquês”, TCP4Carvalhido”, TCP5Viso” and TCP6Br. Sto. Eugénio”. The total distance between TCP1 and TCP6 (one way path) is 11.1 km, and it includes a total of 37 bus stops.

Fig. 1.
figure 1

Map of route 206 with identification of the six TCP.

On working days, the schedule of this bus route is organized into 5 shifts. The 5 shifts are coordinated to satisfy the demand of passengers as shown in Fig. 2. Table 2 provides information characterizing each shift, concerning the start, end and length of daily trips, alongside with the corresponding daily slack time, total number of bus stops composing the daily path of each shift (M) and the number positions under assessment for the case study (M∙K).

Fig. 2.
figure 2

Current schedule of route 206 including the five shifts for working days.

Table 2. Information of schedules currently operating in all shifts.

A set of real observations was gathered by the Portuguese company “OPT - Optimização e Planeamento de Transportes SA”, collected from GPS devices installed in all buses operating in Porto. The process of data collection was operationalized with a script gathering information automatically from each bus, with a periodicity of 5 min, during 3 weeks. Each record includes the identification code of the bus, the time of the observation and the estimated time that the bus would spend to achieve each TCP.

The resulting dataset was used to feed the model, including K daily observations with the arrival time at each TCP for each shift. The dataset encompasses the surveillance of 13 working days (i.e., excluding holidays, Saturdays and Sundays).

Figure 3 shows the daily observations (grey lines) alongside the current schedule (continuous black line) for each of the five shifts under study. It is possible to see that buses of all shifts typically arrive early and leave delayed from the inversion spot TCP5. They are often delayed when arriving to TCP6. In general, the shifts are able to comply reasonably with the plan during the first half of the day. The trips observed between 14 h 00 and 21 h 00 are systematically delayed. The last trip of shift 1 is an exception, as it always ends earlier than the schedule. Shift 3 has the most irregular pattern, including severe deviations from schedule throughout the day.

Fig. 3.
figure 3

Nominal schedule (black line), robust schedule R[1;1] (dashed blue line) and real observations (grey lines) for the five shifts of the case study (Color figure online).

6 Results and Discussion

This section presents the results obtained with the optimization model implemented using IBM ILOG CPLEX Optimization Studio, version 12.6, for the case study analysed.

The optimization model is endowed with freedom to reallocate slack time, preserving the total daily time of the current schedule, defined by Ψ (i.e., the total time between the beginning and the end of the daily trips). This constraint allows a fair comparison between the current schedule and the schedules obtained with the optimization model, as both have the same total duration of the daily trips, implying identical operational costs. All runs used a value of β equal to zero, meaning that efforts of drivers to minimize deviations from schedule are not accounted for. Thus, it is assumed that drivers adopt a similar behaviour in all trips between two given TCPs.

The model optimizes the robustness of individual shifts. Different viewpoints regarding the relative importance of delays and anticipations can be adopted. A neutral perspective is labelled as R[1, 1] (γ1 = 1; γ2 = 1), the exclusively minimization of anticipations as R[1;0] (γ 1 = 1; γ 2 = 0), and the fully minimization of delays as R[0;1] (γ1 = 0; γ2 = 1).

The measures defined in Table 1 were used to assess the robustness of the different schedules obtained with the optimization model. The results are summarized in Table 3, including different values for the tolerance threshold ε, allowing a broad depiction of the sensibility of the performance measures to changes in the threshold used for the evaluation of deviations from schedule.

Table 3. Robustness measures obtained for the five shifts.

Considering the length of deviations from schedule, when all shifts are taken into account we can observe that the average delay and anticipation from schedule is around 1.5 min. However, the delays and anticipations occur with different patterns in different shifts. For example, in shift 5 the average length of delays is considerably longer than the length of anticipations (2.6471 vs. 1.2874), whilst the reverse occurs for the other shifts. In the optimized schedules, the length of deviations from schedule could be reduced to values around one minute, which represents a notable overall improvement in relation to the current schedule, with an overall reduction between 67 % and 73 % of the length of delays of the original schedule and an overall reduction between 79 % and 82 % of the length of anticipations of the original schedule, depending on the scenario considered. The greatest improvements in terms of delays occurred for shift 5, whereas for shift 3 the improvements are less significant. Regarding anticipations, the improvement pattern is more homogeneous among the shifts, although shifts 1, 2 and 3 have slightly better results for the indicator AA r than shifts 4 and 5.

The analysis of the three scenarios with different priorities for the reductions of delays or anticipations (R[1;1], R[1;0], R[0;1]) revealed that the performance of the schedules obtained is not very sensitive to the specification of parameters γ 1 and γ 2.

Regarding the indicator evaluating the proportion of delayed or anticipated arrivals of the buses to the TCPs, we can see that the optimized schedule is particularly good in terms of avoiding extreme values of delays and anticipations. Thus, as the threshold of delays and anticipations increases, the performance of the optimized schedule in terms of the proportion of delays and anticipations improves significantly. For example, considering a tolerance value of ε equal to 3 min in the scenario R[1;1] giving equal importance to delays and anticipations, the new schedule reduces the probability of delays and anticipations to around 65 % of the values associated to the nominal schedule. Regarding the individual shifts, the greatest improvements to these indicators are possible for shifts 1, 2 and 5.

The visualization of the schedules obtained with the optimization model considering a neutral perspective on the relative importance of delays and anticipations (γ 1 = 1 and γ 2 = 1) is provided in Fig. 3. This plot represents the schedules returned by the model with a blue dashed line and the corresponding nominal schedule with a continuous black line, for each of the five shifts. Whilst the current schedule allocates slack time exclusively at path inversion TCPs (i.e., TCP 1, 5 or 6), the schedules obtained from the optimization model redistribute the slack time through all TCPs.

7 Conclusions

The main objective of this work was to endow bus schedules the ability to absorb uncertainty through their adjustment to historical data of real bus trips collected by ICT technologies. This involves the use of optimization techniques for the robust planning of bus schedules. This study proposed an approach to enhance existing schedules based on a mathematical programming model applicable to bus routes performing non-cyclic daily trips with non-even headways. The model uses real observations to learn the best distribution of slack time to be incorporated in each TCP integrating the daily path.

A set of measures was proposed and applied to quantify the robustness of bus schedules. The assessment of delays was quantified with the Average Delay (AD) and the Proportion of Delays (PD) at each TCP. These measures were complemented with relative measures (AD r and PD r ) allowing comparisons with a reference schedule. The assessment of anticipations was conducted using analogous measures: Average Anticipation (AA), Proportion of Anticipations (PA), and the relative measures AA r and PA r , involving direct comparisons with a reference schedule.

The validity of the optimization model was demonstrated with its application to a real-world case study, concerning a bus route operating in Porto, with five daily shifts. The schedules obtained with the model were able to minimize delays and anticipations in all shifts when compared with the corresponding original schedule. For the cases analysed in this work, the schedules obtained with different values of the parameters γ 1 and γ 2, reflecting the priorities assigned to improvements in delays or anticipations, were very similar. This demonstrates that this model does not need extensive experimentation for the calibration of the parameters.

The development of models and measures for the improvement of robustness of bus schedules is relevant to support managers in their effort to enhance urban transportation systems. The optimization model developed in this paper can contribute to improve existing schedules based on the data made available by ICT regarding the actual daily trips of buses. This type of quantitative modelling approaches, involving mathematical programming, can help translate the vast amounts of data currently collected by transportation companies into valuable information allowing the continuous improvement of mobility services provided to citizens.