1 Introduction

In this paper, we describe an optimization procedure for planning urban public transit networks. Public transit planning comprises a variety of interesting optimization problems. In network planning, one has to decide where rail tracks or roads and stops are to be established. The results of the line optimization step or the public transit network optimization step are lines or routes that connect stops on the network with certain frequencies. Timetabling assigns precise departure times to the routes for each station. The result of vehicle scheduling is an optimal allocation of vehicles to the trips of the timetable. Crew scheduling determines duties of 1 day for bus drivers and crew rostering finally combines duties to a sequence of 1-day duties. Crew scheduling and crew rostering have to meet many legal requirements to generate allowable crew schedules. Within this paper, we concentrate on establishing an optimal public transit network.

The construction of a transit network in urban areas is one of the strategic tasks of public transit companies. The basic objective is to establish a line concept which maximizes or minimizes a particular objective. Objectives can be customer-oriented or cost-oriented. While cost-based approaches minimize the total cost subject to certain requirements, customer-oriented approaches maximize passengers’ comfort. There are different definitions of this comfort; usual is the total number of direct travelers, the total number of needed transfers, or the total travel time of all passengers.

In practice, transportation planners have to analyze the benefits of a modified public transit service before improvements are realized. For example, planners use cost–benefit analysis to examine how many people will profit from a, e.g., new transit route. Moreover, it is important to anticipate total effects of the modified network. One option is to measure how many new passengers (within the entire network) can be expected when a new transit route is established. If we assume a fixed origin–destination matrix for each transit mode (car, walking, cycling, and public transit) we have to ignore effects of a changing modal split within the optimization process.

In contrast to conventional optimization approaches, we do not consider a fixed public transit origin–destination matrix. We assume that the number of passengers depends on the public transport service. In this paper, we introduce a new customer-oriented objective—the total number of expected public transit passengers. Finally, we construct networks with the maximum number of expected passengers subject to a fixed budget. On the one hand, we are able to maximize the expected number of passengers under the constraint of a fixed budget. On the other hand, we also could minimize the total costs subject to a minimum number of expected passengers. Hence, it is possible to relieve the public purse without a loss of network quality.

The benefits of this simultaneous approach are obvious: We are able to evaluate the emerging network regarding the expected passengers’ reaction. As a result, we close a gap between (mixed integer) linear optimization and the demand forecast using discrete choice theory. Here, we apply a binary logit model to anticipate the passengers’ reaction.

The article has the following structure: First, we present a short review of the literature on network optimization in public transportation. After that, we develop an integer programming model for maximizing the expected number of transit passengers. In the following section, we briefly discuss how to model the demand for public transport. Then, we exhibit parallel running routes as a phenomenon of esp. urban public transport services and their effects on expected waiting times. Afterward, we explain the data structure, i.e., the extended arc and path definition. Next, we present typical properties and results of the optimization model by means of a small example. Finally, the results of computational experiences with real-world data sets are described.

2 Related work

In the literature, we find many optimization algorithms for the transportation network design problem. We can classify them by the objective, by the used solution approach or by the assumptions made. A comprehensive survey of exact approaches (closed-form linear optimization problems) for line optimization can be found in Schöbel (2011).

The first paper dealing with the line planning problem was published by Patz (1925). It contains a heuristic algorithm for minimizing empty seat kilometers. A given demand for public transit is thus fulfilled by a service with relatively heavily used vehicles. This was indirectly a first cost-oriented algorithm. An advanced cost minimizing approach is presented by Claessens et al. (1998). It incorporates the decisions of choosing routes and their frequencies. Other cost-oriented approaches are presented by Bussieck (1998) and Bussieck et al. (2004). In Goossens (2004), we find a model which additionally allows different route types (e.g., express lines which do not stop at every station). Typical for cost-oriented models is the minimization of the operating costs while it is ensured that all passengers are able to reach their destinations. The definition of costs differs between the published articles.

However, a wide range of publications are related to the maximization of the number of passengers who are able to reach their destinations without transfers (direct passengers). This can be seen as a customer-oriented objective. A first, three-stageFootnote 1 approach was presented by Dienst (1978). This procedure was extended by Oltrogge (1994) by introducing the so-called system split, i.e., the assignment of passengers to different means of transport (e.g., bus and tram) before the routing of the passengers. However the disadvantage of direct passenger maximization is that passengers who need to transfer are ignored.

Other customer-oriented papers deal with the total driving time or the discomfort of all passengers. Borndörfer et al. (2007, 2008) minimize the total driving time where the complete demand for public transit has to be fulfilled. The special feature of their first model is that lines are created from the scratch within the optimization process. In addition to in-vehicle time (IVT), Nachtigall and Jerosch (2008) consider a penalty for each required transfer. Overall, the inconvenience of all passengers is minimized. Scholl (2005) developed a model for minimizing the total number of needed transfers by all passengers. She clarified among others that the two objectives—maximizing the number of direct passengers on the one hand and minimizing the number of transfers on the other hand—commonly result in different line concepts.

Shimamoto et al. (2012) present a multi-objective optimization model where passenger route–choice behavior is explicitly considered. It contains a detailed traffic assignment which accounts for parallel running routes with vehicle capacities.

If we want to minimize the total driving time of all passengers or the total operating costs, we need a constraint that the complete demand for public transit has to be fulfilled. Thus, we need the demand for public transit as an input. But knowing the public transit demand independently of the public transit network is not very realistic. Therefore, we put the fixed public transit origin–destination matrix into question.

Klier and Haase (2008) present an optimization model without a fixed public transit demand. They use a linear demand function of the expected travel time to model the impact of travel time on the demand for public transit. To measure expected waiting times, it is assumed for reasons of simplification that vehicles of parallel transit routes arrive equally distributed at stops. This approach is, however, not ready for solving real-world instances.

Guihaire and Hao (2008) provide a review about transit network design procedures, which focuses not only on exact formulations but on heuristic approaches as well. They propose the terminology transit network design and frequency setting problem (TNDFSP). The review of Kepaptsoglou and Karlaftis (2009) denotes the problem as a transit route network design problem. This overview classifies a wide variety of primarily heuristic methods. Among these heuristics are some that take an elastic demand into account.

Obviously mathematical optimization models that incorporate the forecast of public transit demand within the transit network optimization on the one hand and are applicable for real-world instances on the other hand are still missing in the literature. For this reason, we developed the model presented in this paper.

The main feature of our optimization approach is that we consider passengers’ reaction to the public transit service. We can already find the inclusion of demand reactions in optimization processes in other application fields. Haase and Müller (2013) and Müller et al. (2009) present models for optimizing school locations with consideration of students’ school choice. Haase and Müller (2014a) contain two formulations for optimizing the locations of stores in a network of a parcel system. With these approaches, competition effects between stores of the same network and competing networks can be modeled. Haase and Müller (2014b) discuss a model for location planning of health care facilities regarding the choice behavior of clients.

3 Optimization model

In this section, we present a combinatorial optimization model that incorporates the reaction of transit users.

3.1 Graph definition

We assume that the urban area is subdivided into districts (set \(R\)). The graph (Fig. 1) underlying our model consists of nodes \(s, s' \in S\) representing public transit stops and access points to the districts \(r \in R\). The stops are connected by driving arcs \(a^d_{s,s'}\), for which a public transit driving time is known. Stops and districts are connected by walking arcs \(a^w_{r,s}, a^w_{s,r}\). To each walking arc, an average walking time is assigned.

Fig. 1
figure 1

Graph definition

Transit routes \(\ell \in L\) consist of adjacent driving arcs (\(a^d_{s,s'}\)). A travel path \(w \in W\) through the graph is a path from one district connection node \(r\) to another one. Assuming that we know the transit routes \(\ell \) and frequencies \(f\) which are needed to establish path \(w\) (set \(LF_w\)), we are able to predict the expected travel time consisting of walking time, waiting time, driving time, and with these characteristics the expected demand (see Sect. 4.1).

3.2 A model for maximizing the demand for public transit

To establish path \(w\), all route–frequency combinations \(LF_w\) must exist. With the set \(LF_w\), we know exactly which transfers on path \(w\) are necessary, and with the known frequencies of the transit routes, we can pre-estimate expected waiting times. We, furthermore, can predict for each \(w\) the expected travel time, and finally, the expected number of public transit passengers \(b_w\) on the connected relation. The set \({W}_{uv}\) contains the paths \(w\) that connect the districts \(u\) and \(v\). Let \(C\) be the available budget and \(c_{\ell f}\) the costs for operating route \(\ell \) with frequency \(f\). Our task is to select a set of transit routes, and for each selected route \(\ell \) the frequency \(f\). For this, we define the binary variables \(y_{\ell f}\). With the binary variables \(z_w\), which take the value of one, if path \(w\) is established, we obtain the following model:

$$\begin{aligned}&\max \sum _{w} b_{w} \cdot z_{w} \end{aligned}$$
(1)
$$\begin{aligned} \sum _{w \in {W}_{uv}} z_{w}&\le 1 \quad \quad \qquad \forall (u,v) \in {R}^2| u \ne v \end{aligned}$$
(2)
$$\begin{aligned} z_{w}&\le y_{\ell f} \quad \qquad \forall \ell ,f,w| (\ell ,f) \in LF_{w} \end{aligned}$$
(3)
$$\begin{aligned} \sum _{f} y_{\ell f}&\le 1 \quad \quad \qquad \forall \ell \in L \end{aligned}$$
(4)
$$\begin{aligned} \sum _{\ell }\sum _{f} c_{\ell f} \cdot y_{\ell f}&\le C \quad \quad \end{aligned}$$
(5)
$$\begin{aligned} y_{\ell f}&\in \{0,1\} \quad \quad \forall \ell \in L, f\in F \end{aligned}$$
(6)
$$\begin{aligned} z_{w}&\in \{ 0,1\} \quad \quad \forall w \in W \end{aligned}$$
(7)

The objective (1) maximizes the total number of expected passengers. For each relation at most one path can be selected (2). Thereby we assume that passengers travel on one unique path between two districts. If path \(w\) is offered, all of the corresponding transit routes and frequencies have to be established (3). We assign to each transit route \(\ell \) at most one frequency (4). The budget must not be exceeded by the costs of the established routes (5). All decision variables are binary (6 and 7).

3.3 Coupling the path variables and the route variables

The coupling constraints (3) can be replaced by the aggregated formulation

$$\begin{aligned} \sum _{w| (\ell ,f) \in LF_{w}}z_w \le M \cdot y_{\ell f} \quad \forall \ell \in L, f\in F. \end{aligned}$$
(8)

where \(M\) denotes a sufficiently large number. Resulting from this reformulation, a problem of a considerably reduced size emerges. As one typical example, we show the development of the problem size with the two types of coupling constraints (I: disaggregated, II: aggregated) for a real-world instance in Table 1.

Table 1 Comparison of the problem sizes resulting from the disaggregated (I) and the aggregated (II) formulation of the coupling constraints

For real-world data sets, we use both of the formulations. The aggregated coupling constraints (8) lead commonly faster to feasible solutions while the disaggregated formulation (3) converges faster because of tighter LP-bounds.

In contrast to the model presented in Klier and Haase (2008), we here neglect the vehicle utilization. Therefore, we do not check explicitly whether capacity is exceeded or not. The difficulty lies in the fact that the travel demand depends on the trip purpose and the daytime. Commonly, there is no homogeneous peak hour to measure the maximum utilization for a capacity constraint. For example, the peak hour of trips to school or to work differs from the peak hour of shopping trips. If we concentrate on one global peak hour, we possibly make a mistake for parts of the network where shopping trips are essential. For simplification, we assume that there is enough capacity as long as the routes are established. The capacity is determined by the vehicle size and mainly by the frequency while the demand for public transit depends on the travel time. If the frequency of a transit route is low, the waiting time is relatively high. Since we define the arrival of passengers at stops as a random process (even for low frequencies), we overestimate waiting times. This leads to a decreasing demand for public transit and hence to a lower vehicle utilization in our model. Thus the overestimation of the waiting time at low frequencies works at least partially as a capacity constraint.

4 Demand modeling and data structure

In this section, we first present how to model the demand for public transit. After that we show an approach to determine waiting times in urban areas. Remarks on the sets on which the optimization model is based on and their construction close this section.

4.1 Modeling the demand for public transit

The main focus of this paper is the reaction of potential passengers to the established public transit service. The most important decision we are interested in is the mode choice. We use a binary logit model as a discrete choice approach. Discrete choice models (Ben-Akiva and Lerman 1985) are developed for estimating probabilities for choosing one of a set of discrete and disjunctive alternatives. A binary logit takes exactly two alternatives (e.g., public transit and other modes of transport) into account.

A variety of studies deal with the impact of the fare on public transport demand. In contrast, relatively few published studies deal with influences of the service quality on demand. Wardman (2004) and Paulley et al. (2006) show that the strength of the influence depends on the type of time. So the elasticity of demand for public transit with respect to IVT is between \(-0.4\) and \(-0.6\), with respect to waiting time \(-0.64\) (Paulley et al. 2006). It is also possible to measure different weightings by calculating values in terms of IVTs. Wardman (2004) shows that one minute waiting time for buses has the same valuation as 1.6 min IVT while the value of walking times vary between 1.4 and 2.0 units of IVT.

Usually the utility functions of logit models correspond to individuals. Here, we do not consider individual specific factors (e.g., sex, age, income, and car availability). Thus, we use our exemplary linear utility function for path \(w\) instead of individual \(n\). The function contains the exogenous variables IVT \(t^{\mathrm{drive}}_w\), waiting time \(t^{\mathrm{wait}}_w\), walking or access time \(t^{\mathrm{walk}}_w\), and the distance \(d_w\). Times are given in minutes and distance is measured in kilometers. Assuming a fixed destination choice in which the individuals have to overcome a specific distance to their destination using path \(w\). With an increasing distance, the utility of public transit also increases if the travel time is fixed (the effective velocity must increase). Here, we use typical properties of published studies on demand elasticities in public transit. A wide range of studies were summarized by Balcombe et al. (2004). Accordingly, one can assume a relatively inelastic demandFootnote 2 with respect to waiting time to IVT and to walking time. The deterministic utility \(V_w\) for public transit that we use is given by Eq. (9).

$$\begin{aligned} V_{w}=-0.5 +0.1 \cdot d_w -0.0195 \cdot t^{\mathrm{drive}}_w -0.0381 \cdot t^{\mathrm{wait}}_w -0.03 \cdot t^{\mathrm{walk}}_w \end{aligned}$$
(9)

The expected demand for public transit on path \(w\) is the product of the logit probability for choosing the alternative “public transit” and the total demand from \(u\) to \(v\). The overall travel demand \(\mathrm{od}_{uv}\) is the sum of trips made on an average working day over all means of transport and over all trip purposes.

$$\begin{aligned} b_{w}= \frac{1}{1 + \mathrm {e}^{-V_{w}}} \cdot \mathrm{od}_{uv} \qquad \forall (u,v), w \in W_{uv} \end{aligned}$$
(10)

where \(W_{uv}\) denotes the set of all potential paths which connect the districts \(u\) and \(v\) with the expected public transit demand \(b_w\). We assume that the choice probability for public transit depends solely on the best established transit path between two districts. As a consequence, all passengers from \(u\) to \(v\) will take the same path. Here, we concentrate on mode choice and neglect route choice aspects depending on preferences of passengers.

4.2 Expected waiting times and parallel public transport routes

Characteristic for urban areas are situations where more than one route connect two stops directly. In the literature, this situation is discussed under the term Common Line Problem (Chriqui and Robillard 1975). For passengers, this could mean a significant reduction in waiting times. Ideally, the expected waiting times can be halved if a second route operates with the same frequency between two stops (see Fig. 2). We denote the headway of route \(\ell \) by \(h_\ell \). Obviously, the waiting times depend on the frequency a route is operated with and the time lags between the arrivals of routes. The time lag of route \(\ell \) is derived from the departure time of its first vehicle, denoted by \(\Delta _\ell \).

Fig. 2
figure 2

Waiting times for two parallel routes with a headway of 30 min and different start time lags, \(\Delta _\ell \). In the first situation (top), the first vehicles of route 1 and route 2 arrive at the same time (\(\Delta _1 = 0, \Delta _2=0\)). In the second situation (bottom), the first arrivals are displaced by 15 min (\(\Delta _1=0, \Delta _2 = 15\)). The expected waiting time in situation 2 is only the half of it in situation 1

Comparing the two situations of Fig. 2, we recognize that vehicles of the two parallel routes arrive on the one hand at the same time at the considered stop and on the other hand the first vehicle of route 2 arrives 15 min after the first vehicle of route one (\(\Delta _1=0, \Delta _2=15 \Rightarrow \Delta _{2}-\Delta _{1}=h_1/2\)). The expected waiting time for randomly arriving passengers is 7.5 min instead of 15 min if both routes depart at the same time. Thus, we have to take into account the effects of parallel routes on the expected waiting times.

The operation model we use works as follows: We modeled passenger arrivals at a stop by exponentially distributed times between arrivals (\(t^{\mathrm{betw}}_n\)). The arrival time of individual \(n\) is computed by \(t^{a}_{n} = t^{a}_{n-1} + t^{btw}_n\). The time between the arrivals of vehicle \(b-1\) and vehicle \(b\) of transit route \(\ell \) is modeled as a normally distributed stochastic variable \(t^{btwt}_{b,\ell }\). The arrival time of vehicle \(b\) results from \(t^{at}_{b,\ell } = t^{at}_{b-1,\ell } + t^{btwt}_{b,\ell }\). The arrival of the first vehicle \(t^{at}_{0,\ell }\) of line \(\ell \) is modeled by the between 0 and \(h_\ell \) uniformly distributed variable \(\Delta _\ell \). The waiting time of individual \(n\) follows from \(t^{\mathrm{wait}}_n = \min _{\ell ,b | t^{at}_{b,\ell } \ge t^{a}_n} t^{at}_{b,\ell } - t^{a}_n\). Having in mind our real application, we simulated the average waiting times for \(q=1,\ldots ,69\) scenarios with at most four parallel routes (\(i=1,2,3,4\)), different headways and stochastic passenger and vehicle arrivals. While the arrival process of passengers is defined as a poisson process, the inter-arrival times between two vehicles of the same route are normally distributed with a standard deviation of 10 % of the (planned) headway. An extract of the simulation results can be seen in Table 2. Scenario 43 contains, for example, four parallel routes with the headways \(h_{qi}\) (\(h_{43,1}=10, h_{43,2}=10, h_{43,3}=20, h_{43,4}=30\)) which are operated on one relation. In this case, the simulation of 1,000 runs with 500 simulated passengers each produces an average waiting time of 2.69 min (\(\bar{t}_{\mathrm{wait}}\)). The theoretic waiting time \(t_{\mathrm{wait}}^{\mathrm{naive}}\) would occur if all arrivals of all routes were perfectly evenly distributed (cf. 11). This is a systematic underestimation of the expected waiting time. For scenario \(q=43\), the simulated waiting time is on average 52.8 % higher than the naive one.

$$\begin{aligned} t_{\mathrm{wait},q}^{\mathrm{naive}}&= \frac{1}{2\cdot \sum _{i|h_{qi}>0} \frac{1}{h_{qi}}} \\ t_{\mathrm{wait},43}^{\mathrm{naive}}&= \frac{1}{2 \cdot \left( \frac{1}{10} + \frac{1}{10} + \frac{1}{20} + \frac{1}{30} \right) } = 1.765 \, (\min ) \nonumber \end{aligned}$$
(11)

The difference between \(t_{\mathrm{wait}}^{\mathrm{naive}}\) and \(\bar{t}_{\mathrm{wait}}\) increases if the arrivals at the stops are more erratic. This occurs when the arrivals are more stochastic (e.g., delays) and when the headways differ strongly between the routes.

Table 2 Simulation results for waiting times under parallel routes

Within the simulation process, we also count the passengers that board a vehicle of each observed route. This results in the shares passengers are split on the parallel running routes (not shown in Table 2). The shares depend not only on the frequencies but also on the driving times (e.g., detours) of each route. If the driving times of (partial) parallel running routes differ between the same pair of connected stops, we have to weight the driving times by the shares of passengers boarding the routes to compute average driving times.

Remarks on random arrivals of passengers: In the simulation experiment described above, we used exponentially distributed inter-arrival times of the passengers. This means that passengers do not know the schedule or ignore it while planning their trips. This is a restricting assumption, e.g., for routes with low frequencies. It would be more realistic to differentiate passengers who arrive randomly at the access stop and passengers who planned their arrival to minimize the waiting time. We forbear from regarding this behavior because of our data structure. The paths \(w\) through our network are based on public transit arcs that contain the driving time plus the expected waiting time (see Sect. 4.3). We do not distinguish between arcs with waiting time at the access station and those with waiting times at a transfer station. Network design is about establishing routes and frequencies, but not the exact timetable. Hence the times between arrivals of different routes are unknown. Therefore, we do not know the waiting times at transfer stations. Passengers possibly plan the arrival at the first public transit station to minimize waiting time, but they can not do this for transfers. This is the reason for assuming randomly arriving passengers and simulating the arrival time of the first vehicle of one route (\(\Delta _\ell \)) as uniformly distributed variables.

4.3 Path generation and pre-processing

For each combination of two stops directly connected by a route, we create a new public transit arc \(\hat{a}_{s,s'}\). The average waiting time \(\bar{t}_{\mathrm{wait}}\) in addition to the driving time is assigned to each arc. Thus, we can use route-specific waiting times. The expected waiting times at a bus stop depend on the frequencies of the routes. Therefore, we introduce one arc for each combination of route, frequency, and the pair of stops connected by the route.

The number of public transit arcs \(\hat{a}\) increases substantially if we want to take waiting time effects of parallel running routes into account. Therefore, we need one arc for each combination of stops which are connected by the parallel running routes and the combinations of frequencies of the involved routes (see Fig. 3).

Fig. 3
figure 3

This figure shows the required arcs for two partial parallel running routes (a) with two possible frequencies each. In this situation, we need 36 arcs (b). Both routes provide each six directly connected pairs of stops [route 1: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4), route 2 analogously]. This six connections can be established with two different frequencies. So we need 12 arcs for each route for the direct connections (provided by only one of the two routes). Furthermore, we need 12 additional arcs to present the three directly connected pair of stops if both routes are established [(2,3), (2,4), and (3,4)]. Here, for each relation we need four frequency combinations of two routes with two frequencies. c is a magnification of relation between station 2 and station 3. To each arc, a combination of frequencies is assigned. Therefore, we can assign an average waiting time to each arc

A path \(w\) uses exactly two walking arcs \(a^w\) (from the origin district to the access stop and from the egress stop to the destination district) and one, two, or three adjacent transit arcs \(\hat{a}\) (up to two transfers). Figure 6 shows one concrete path including two transit arcs (one transfer) corresponding to the example described within the following section.

One problem is to find all promising paths without ignoring those which can be part of an optimal solution of the problem and without creating a vast amount of variables \(z_w\). This is done in a time-consuming pre-processing. Within this, we compare paths regarding the expected utility for the passengers and the involved transit routes and frequencies. The enumerating pre-processing works as follows:

  1. 1.

    Compute the shortest paths for all origin–destination-relations \((u,v)\) based on the public transit driving time.

  2. 2.

    Check which transfer stations \(s\) (for one or two transfers) are possible for each OD-pair if the total driving time of the path must not exceed 1.5 times the shortest driving time.

  3. 3.

    Check which combinations of public transit arcs are feasible for creating paths corresponding to the transfer stations determined in step 2 with at most two transfers. Each route shall be contained at most within one arc (no transfers within the same route allowed). The arcs must be adjacent and any nodes may be visited not more than once.

  4. 4.

    Create all paths by assembling feasible combinations of arcs found in step 3. Check if there is another path with the same or a subset of route/frequency combinations with an equal or higher utility. If such a path already exists, do not store the created one. If the new one has a higher utility value (with the same or a subset of route/frequency combinations), delete all dominated existing paths. For real-world instances, this step needs a lot of time because it has to be checked if a new path dominates (or is dominated by) any of the already generated paths. Store routes \(\ell \) and frequencies \(f\), which correspond to the used transit arcs, within the set \(LF_w\).

5 Introductory example

We illustrate the approach by means of a small example containing seven districts, \(R = \{r_1,r_2,\ldots , r_{7}\}\), and ten stops, \(S = \{s_1,s_2, \ldots , s_{10}\}\) (see Fig. 4). Suppose that there are nine potential routes \(L = \{\ell _1, \ldots , \ell _9\}\) (see Fig. 5 left). Each of them can be operated with headway of either ten or thirty minutes. The costs for each combination of route and frequency are proportional to the operating time (see Fig. 5 right).

Fig. 4
figure 4

Example with seven districts (\(r_1, \ldots , r_7\)) and ten stops (\(s_1, \ldots , s_{10}\)). The solid lines represent the public transit network sections weighted with the public transit driving time in minutes. The dashed connections are the walking edges weighted with the average walking time in minutes

Fig. 5
figure 5

Pool with nine possible routes (left) and costs for route \(\ell \) operated with a frequency \(f\) (right)

Within the procedure of pre-processing 324 paths for the 19 relations with a symmetric positive demand (upper triangular matrix in Table 3) are generated. This leads to an optimization problem with 682 rows, 343 columns, and 1,989 non-zeros. Figure 6 contains one possible path \(w\) from district \(u=r_1\) to district \(v=r_3\). In this situation, the transit routes \(\ell _6, \ell _8\), and \(\ell _9\) must exist (with certain frequencies). This path implies one transfer at station \(s_8\). In conjunction with our findings from Sect. 4, we expect a choice probability for public transit of 0.35. In Fig. 7, we show the results for maximizing the total demand for public transit. One typical characteristic of the results consists in diminishing marginal increases of the demand with increasing budget.

Table 3 The total travel demand from origin to destination (upper triangular matrix) contains the total number of trips per day irrespective of the mode of transport

In Fig. 7, we compare the maximization of the number of expected passengers with the maximization of the number of direct passengers on the one hand (left). The maximization of the number of direct travelers is based on direct transit paths (without transfers). Within the pre-processing, we construct only direct paths that can be established. We weight the paths by the total travel demand \(\mathrm{od}_{uv}\) and maximize the number of potential passengers who are able to travel directly by public transit. For each path, we predict the demand for public transit and compare the expected total transit demand with the transit demand of the demand optimal network (\(D^*\)). On the other hand (right), we show the difference in expected public transit demand \(D^{\mathrm{expected}}\) between pure demand maximization and the maximization of the expected demand as a service of general interest (SGI). In the latter approach (SGI) it is (e.g., politically) required that all relations are made possible by public transport. This is achieved by the fact that constraint (2) changes from a less than or equal (\(\le 1\)) to an equality condition (\(=1\)). For a reasonable application, we initially have to determine the least required budget.

Fig. 6
figure 6

Subnetwork with three routes for connecting the districts \(r_1\) and \(r_3\) with driving and walking times (left) and the resulting path \(w\) (right). Route 6 and 8 are operated with a headway of 10 min, route 9 with a headway of 30 min. Passengers walk from \(r_1\) to station \(s_1\) (2 min) and drive with route 6 to station \(s_8\). The arc from station \(s_1\) to station \(s_8\) is weighted with 9 min in-vehicle time and the expected waiting time of 5.039 min (scenario 1 in Table 2). In station \(s_8\), passengers transfer to the first vehicle of either route 8 or route 9. The transit arc from \(s_8\) to \(s_6\) is weighted with the average in-vehicle time of 11.34 min and the expected waiting time of 4.464 min (see scenario 8 in Table 2). At station \(s_6\), passengers alight from route 8 or 9 and walk to district \(r_3\) (10 min). This path \(w\) consists of two walking arcs and two transit arcs (\(\hat{a}\)) with a total walking time \(t^{\mathrm{walk}}_w\) of 12, a total in-vehicle time \(t^{\mathrm{drive}}_w\) of 20.34 and a total waiting time \(t^{\mathrm{wait}}_w\) of 9.503 min. The deterministic utility \(V_w\) is \(-\)0.6187 and the probability for choosing public transit is about 0.35. The set \(LF_{w}\) of this path \(w\) contains the route–frequency combinations \((\ell =6,f=6),(\ell =8,f=6),(\ell =9,f=2)\)

The comparison shows, that the three optimization approaches lead typically to different transit networks. Assuming that the expected number of passengers is a better approximation for the effective quality of the transit network (than the number of direct passengers) we get better results than the maximization of the number of directed passengers. Especially for small budgets, the quality differs considerably. Let us look at the results for a budget of 670 (cf. left side of Fig. 7). In this extreme case, the maximum number of expected passengers is 294.52 while the direct passenger optimal network attracts only 226.01 expected passengers. If we assume nearly constant average fares per trip, we achieve about 30 % higher revenues by the demand maximum solution. In this example, the expected demand for public transit of the demand optimal networks is on average 13 % higher than the expected demand of the direct passenger networks.

If we compare the networks corresponding to pure demand maximization and to demand maximization as SGI, we can identify the cost of the political restriction that all districts have to be connected. For example, let us focus on the solutions for rather small budgets (right side of Fig. 7). The expected demand is at most about 11 % higher than the demand resulting from the SGI network (248.87 vs. 223.83 passengers for \(C=105\)). This means that the transportation company or the local authority foregoes 25 passengers per day in favor of giving all residents the opportunity to use public transit. On average, the pure demand maximization leads to a gain of about 0.8 %.

Fig. 7
figure 7

Left: comparison of the expected demands resulting from public transit networks after maximizing the expected passengers (\(D^*\)) and maximizing direct passengers (\(D^{\mathrm{direct}}\), left) Right: comparison of the expected demands of demand maximized networks and those resulting from maximizing the expected passengers as a service of general interest (\(D^{\mathrm{SGI}}\), right). The gray areas show the gain in expected passengers by demand maximization. Note that the curve \(D^*\) on the right side is a zoom-in of the \(D^*\) curve on the left side

Figure 8 shows that the expected public transit demand of single relations may decrease with an increasing budget. An increase of the budget from 43 to 44 leads to the result that the demand of relation (\(r_3-r_4\)) decreases from 87.69 to 84.95, but the global optimal demand increases from 191.3 to 200.06 passengers per day. If we compare the two situations in detail, we observe that the demand optimal line concept contains the public transit routes 3 and 7 (headway 30 min) for the budget \(C=43\). If we increase \(C\) by one, it is possible to substitute route 3 by route 5 with a headway of 30 min each. The two situations are visualized in Fig. 9. The reasons for the decline of the public transit demand on relation (\(r_3-r_4\)) are that the driving time increases from 7 min (\(s_4-s_5-s_6\)) to 19 min (\(s_4-s_5-s_2-s_3\)). On the other hand, the access time (walking) is reduced significantly from 14 min (\(r_3-s_6\) and \(s_4-r_4\)) to 8 min (\(r_3-s_3\) and \(s_4-r_4\)). The expected waiting time is constant because of the unchanged headway of the only public transit route that is used from \(r_3\) to \(r_4\). The deterministic utility term (inserted into Eq. 9) of the relation (\(V_{r_3r_4,w})\) decreases from \(-0.5 + 0.1 \cdot 10 - 0.0195 \cdot 7 - 0.0381 \cdot 15.124 - 0.03 \cdot 14 = -0.6327 \) to \(-0.5 + 0.1 \cdot 10 -0.0195 \cdot 19 - 0.0381 \cdot 15.124 - 0.03 \cdot 8 = -0.6867\). The choice probability for public transit decreases from 34.69 to 33.47 % and thus the expected absolute demand declines.

Fig. 8
figure 8

Demand for public transit under different budgets for single relations

Fig. 9
figure 9

Optimal networks for \(C=43\) (left) and \(C = 44\) (right)

For example, we get for a budget of \(C=153\) (cf. Fig. 10) for maximizing the number of direct passengers a public transit network with the routes 1, 5, and 8 (each with a headway of 30 min), for maximization of the expected demand as service of general interest SGI (one public transit connection for each relation), the lines 1 and 5 (headway 30 min) and the routes 7 and 9 with a headway of 10 min (thick line). The demand maximal transit network contains the routes 3 and 7 (operated every 10 min) and route 5 (with a headway of 30 min). If we compare the three solutions, we realize that the expected number of public transit passengers varies. From this point of view, of course, the best solution is the last one with an objective of 269.81 passengers (left), the second best is solution two (265.84), the worst is the network resulting from direct passenger maximization with 223.83 expected passengers (right).

Fig. 10
figure 10

Optimal networks for \(C=153\) under maximization of the expected demand (left), maximization of the expected demand as a service of general interest (middle), and maximizing direct passengers (right)

6 Experiences with real-world data

We applied the model described in Sect. 3.2 to data for the city of Dresden, Germany. The underlying reduced graph consists of 192 stops and 652 arcs. We tested the performance on ten scenarios varying the size of the line pool \(| L|\), the number of alternative frequencies \(| F|\), the number of considered origin destination pairs \(| \mathrm{OD}|\), and the maximal number of parallel routes, denoted by \(\mathrm{pr}\) for measuring the expected waiting times. Table 4 shows the characteristics of the scenarios. For each scenario, we let the budget vary within a wide range and compare the solutions of a direct traveler approach and the maximization of the demand by the expected number of passengers. The scenarios 1, 2, 3, 9, and 10 result in problems we generally could not solve to optimality with the solver CPLEX 12.2 called by GAMS 23.5 running on a Intel Xeon X5550 (2.67 GHZ, 24 GB RAM). Problems are hard to solve if the number of frequencies and the number of parallel routes or the number of OD pairs are relatively large. For scenario 2, we built a set with 45,594 public transit arcs. This leads to 1,213,045 paths and hence to a problem with 4,455,643 rows, 1,213,247 columns, and 11,524,330 non-zero elements. Scenario 9 considers 17,240 pair of districts. Even though only one frequency was allowed, the problem was really hard to solve. The number of relations that are to be connected seems to be essential.

Table 4 These are the characteristics of the ten test instances (\(\mathrm{sc}\)) with varying number of OD pairs ( \(|\mathrm{OD} |\)), maximum allowed transfers (\(\mathrm{tr}\)), size of line pool (\(| L |\)), frequency pool (\(| F |\)), and maximum numbers of parallel route (\(\mathrm{pr}\)) for measuring waiting times. The subsequent values \(| \hat{A} |\) (cardinality of the set of public transit arcs needed), \(| W|\) (cardinality of the set of paths), the number of rows, columns, and non-zeroes illustrate the problem size

Table 5 contains an extract of optimization results for the scenarios shown in Table 4 with different budgets \(C\). The scenario to which the instance belongs to is denoted as sc while ms is the resulting modal split for public transit of the best known network. The computational time (measured in seconds) we find in the column CPU and the optimality gap at the time the solving procedure was stopped (if no optimal solution could be found within a reasonable time) in the next column.

Table 5 This table shows optimization results for different scenarios (sc) under varying budgets (\(C\))

To show the benefits of our approach, we present \(\lambda \) as the relative increase of the expected demand resulting from maximization of the number of expected passengers compared to the maximization of the number of direct passengers (\(\lambda = 100 \cdot (D^*/D^{\mathrm{direct}}-1)\)). We observe that this increase is normally considerable, even if we could not prove optimality for our best known solutions. In spite of the fact that the optimality gap is quite large (31.43 %) for scenario 2 at a budget of 1,000, we showed that the expected demand for the optimal direct passenger network is more than 5.23 % higher. Assuming an average trip fare (including single and season tickets) of 1.50€ and 220 working days per year, we would increase the expected revenues by \(1.5 \cdot 220 \cdot (33,323-31,666) = 546,810\)€ per year. In extreme cases, the difference between the two objectives is remarkable with up to 17.5 % (scenarios 6 and 7, \(C=1,000\)) which leads to an increase in expected sales of \(1.5 \cdot 220 \cdot (27,742-23,606) = 1,364,880\)€ per year. The largest growth, naturally, we get for scenario 10, because of the larger OD-matrix. We would reach additional 14,995 passengers per day, which leads to an increase of the expected revenues of 4,948,350€ per year. Even though it is not very realistic that transportation planners maximize the number of direct passengers to create the transit network, we show that in general the maximization of the expected demand explicitly leads to networks that attract remarkably more passengers.

7 Summary and outlook

In contrast to current studies on public transit network optimization in this paper, we firstly presented an innovative and novel approach for maximizing the expected total number of public transit passengers. For this, we incorporated the reaction of potential users on a given service while we created the service in a combinatorial optimization model. For each relation, the probability for choosing public transit depends on expected access, waiting, and IVT and the traveled distance. For waiting times, we considered (partial) parallel routes which are typical for urban areas. By means of a small example, we showed the effectiveness of the proposed model. Here, we focused on some typical results, like decreasing demand on single relations at an increasing budget or the different results for an SGI (all relations must be available).

Future research should deal with the development of advanced solution methods like decomposition techniques (e.g., column generation and Lagrangian relaxation) or heuristic approaches. Furthermore, the impact of fares on the mode choice of the passengers could be incorporated. With this extension, a revenue maximizing transit network design would be possible. Moreover, efficiency effects could also be included, if additional social costs (e.g., environmental or congestion cost) are taken into account (e.g., Tscharaktschiew and Hirte 2012).