1 Introduction

Urban rail transit (URT) networks play a major role in facilitating residents' travel behavior and alleviating traffic congestion. However, with the complex spatial and temporal distribution of passenger flow in China’s URT network, the mismatch between passenger demand and transportation supply during peak hours is becoming increasingly obvious. Passenger congestion is extremely serious at some key stations, and emergency management also faces huge challenges. These problems ultimately lie in the incongruity between the network structure and the spatial and temporal distribution of urban residents' travel, which limit the capacity of the URT [1]. Taking Beijing URT as an example, although there are many URT lines in the central city, only 16.5% of Beijing's central city residents traveled by URT on weekdays in 2019, while the percentage of trips by car was 24.3%. The attraction of URT is clearly not comparable to that of cars. However, in Tokyo, Japan [2], in 2008, the share of travel by car was only 11%, and that of URT transit was 48%. Thus, the Beijing URT network has great potential to be tapped.

There is currently no standardized and explicit definition for bottleneck in URT networks. In this study, we consider key areas that limit the effectiveness of URT networks as bottleneck regions. Specifically, from the perspective of matching residents' travel demands with URT supply, we believe that within existing rail network stations, there are origin–destination (OD) pairs with high demand for multimodal travel but extremely low usage of rail transit for commuting. These areas indicate a mismatch between the coverage and connectivity of URT and urban development, and we refer to them as bottlenecks in this study. We first outline the basic principles for the deployment of URT stations in bottleneck areas. A method for identifying potential rail transit stations is then proposed based on the urban hierarchy road network and conventional bus line network. The route between potential station locations in the identified bottlenecks is determined based on the urban road network, in preparation for subsequent optimization of local network route generation. Building upon this, we introduce two objective functions, passenger flow and estimated engineering investment, as well as five constraints including station interval and terminal stations. A single-objective route generation model considering passenger distribution is constructed. The improved genetic algorithm (GA) is employed to solve this model and optimize the routes in the bottleneck areas for enhancement. The study thus provides methodological references for enhancing the role of URT, and solving the traffic congestion problems in large modern cities.

The rest of this paper is organized as follows. In the “Literature Review” section, we review the literature related to the optimization and transformation of transportation networks. We present our optimization model in the “Model Formulation” section and express it as a mathematical model. Then, in the “Algorithm” section, we propose a GA-based algorithm. The performance of our proposed model and solution method is evaluated through the cases in the “Case Studies” section. Finally, the we provide a summary in the “Conclusions” section.

2 Literature Review

The design and optimization of URT networks are important research areas within URT planning. Existing studies focus primarily on optimizing new URT networks before their construction and can be broadly classified into three categories. The first category of research involves adjusting and optimizing newly built network designs by constructing an evaluation index system for network schemes, but it lacks attention to network generation. The second category focuses on optimizing public transportation networks based on newly constructed URT lines. The third category involves optimizing train operation plans for newly constructed URT systems.

Many different approaches have been proposed in terms of network scheme evaluation index system methods. Liu and Qian [3] addressed the limitations of decision-makers’ rationality in practical situations by introducing a multi-objective method. They constructed a multi-objective decision-making evaluation index system for the selection of URT network schemes and calculated the comprehensive weights of each indicator using a synthetic method. Yang et al. [4] proposed an improved TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) optimization decision model for URT network schemes based on the rational behavior of expert groups. Through proximity calculation, they ranked network schemes to achieve a comprehensive decision-making approach that ensures the rationality of traffic network planning for the entire community. Zhang et al. [5] presented an evaluation index weight calculation method based on the analytical hierarchy process (AHP) and entropy weight methods. They comprehensively considered the subjective factors of rich expert experience and the objective information of the indicators, conducting an evaluative analysis of the effectiveness of URT network planning.

With regard to adjusting public transportation networks based on newly constructed URT lines, some studies have focused primarily on optimizing the entire or partial bus networks. In the early stages of model construction, Ceder and Wilson [6] delineated the bus network design problem, summarized various approaches proposed to address this issue, and introduced two mathematical models for public transit network design based on passenger travel time. During the bus network design phase of the Rome transportation agency, Cipriani [7] discovered a linear relationship between passenger volume and speed. Pternea et al. [8] considered sustainable design goals and emission-free (electric) vehicles, introducing a direct route design model with route structure and direct control to address a sustainable public transit network design problem. Barahimi et al. [9] established a mathematical model to determine the capacity increase for links in a dual-mode public transportation network and proposed a particle swarm optimization algorithm with dynamic parameters to solve a multi-objective bi-level model. Nayeem et al. [10] proposed a population-based traffic network design model that prioritizes maximizing the number of passengers served, minimizing the total number of transfers, and minimizing the total travel time for all passengers. They presented a GA-based optimization method for bus network design. As research deepens, models become increasingly complex, incorporating more factors. López-Ramos [11] considered passenger travel time and operational costs, proposing an optimization method to simultaneously address network design and frequency-setting issues for rapid rail transit networks. In a study by Liang et al. [12], the problem was formulated as a multi-objective model with two conflicting objectives: minimizing passenger costs and reducing operator costs. In the solution of such problems, Chakroborty [13] analyzed the reasons for the problems with traditional methods in solving the urban transit network design problem (UTNDP). Emphasizing the effectiveness of GA-based procedures in solving the UTNDP, Walteros et al. [14] used GAs, simulated annealing, and taboo search to provide a set of feasible solutions for the design of a public transportation line network as well as frequency setting in a reasonable amount of time.

Some studies have investigated the optimization of train operation plans for newly constructed URT lines. Canda et al. have conducted a series of ongoing studies that provide a solid foundation for integrating multiple transportation planning steps. In 2016, a mixed-integer nonlinear programming model [15] was proposed to optimize the line frequency and capacity of the URT network with the optimization objectives of minimizing the average travel time and the operation, maintenance, and vehicle purchase costs. It addressed issues such as track allocation for different routes and passenger route assignment. In 2017, with network construction cost, vehicle purchase, operation, and expected income as optimization objectives, an optimization model [16] for URT network topology and line planning were established. The authors innovatively introduced an adaptive large-neighborhood search algorithm for model solving. In 2019, a two-layer optimization model [17] of railway rapid transit (RRT) was constructed, with the line network and line generation on the upper layer. The lower layer of the model considered the passenger flow distribution and train selection and set the line frequency. The lower model layer was linearized as shown in Appendix A, and then was solved using Gurobi optimization. The obtained line frequency and other parameters are used as the input for the upper model for an iterative solution. In addition, researchers in this field have actively explored the introduction of novel objective functions, constraints, and solution methods. Sun et al. [18] built a multi-objective model with maximum passenger volume and minimum passenger travel time for a single-corridor service area under a given rail transit route. In order to solve this model, a new algorithm based on GA was proposed. The feasibility and effectiveness of the proposed model and solution were verified by an example. Cadarso et al. [19] considered passenger demand, strategic costs, and uncertainty in network interruptions. They proposed a two-stage stochastic 0–1 programming model and a hybrid element heuristic algorithm to solve the optimization problem for the rapid transport network. Laporte et al. [20] proposed a network stepwise generation method with the optimization objectives of construction cost and population coverage. The main idea was to first determine an initial set of corridor strips through the planner's a priori knowledge, then fine-tune these line geometries by generating a set of non-dominated paths, and finally solve the bi-objective integer planning procedure to obtain the network solution. Zhao et al. [21] adopted a memetic algorithm to optimize the urban transit network. The mathematical model took the optimal route configuration and service frequency of the URT network as the objective function and aimed to minimize passenger travel costs and unmet passenger demands. The optimization algorithm was embedded with the local search operator based on the classical GA, which improved the computational performance of the algorithm.

In the present study, a comprehensive review of previous literature identified a lack of research on the actual operation and network optimization of existing URT systems. The optimization and transformation of existing URT networks require adjustments and capacity expansions under the constraints of network topology and passenger flow. Building upon the aforementioned methods for optimizing newly built networks, this study explores the construction of local network optimization models for existing URT systems. In order to meet passenger travel demands and reduce operational costs, two objective functions are proposed: passenger flow and estimated investment in line engineering. We also take into account practical constraints such as station interval and mandatory stops. Due to the complexity and specialization of the model, we convert the multi-objective functions into a single objective function, and the GA is improved through elite retention. By optimizing and solving the model considering network topology and passenger demand, local optimization and transformation of existing URT networks are achieved. A case study for optimizing the Beijing URT validates the effectiveness of this research model and algorithm. We also present some policy recommendations, which provide references for future adjustments and transformations of URT networks.

3 Model Formulation

In response to the issue of supply–demand imbalance between existing URT networks and the travel demands of city residents, this study constructs a nested passenger flow allocation model for URT line generation. To consider the mutual influence between passenger flow changes and line generation models, the network passenger flow allocation method is first introduced. Next, in order to construct a local line generation model, the selection process for potential URT stations is determined based on the principles of station selection. This model establishes connections between potential URT stations, aiming to obtain an optimal line considering multiple objectives including passenger travel demands and operational costs. To ensure the practical application of the generated new line scheme, this model not only considers basic constraints such as station interval and passenger volume but also takes into account constraints in practical situations such as OD points and mandatory stops. The following is a detailed introduction to the proposed model.

3.1 Network Traffic Allocation

The generation of a new feasible line scheme in the local line generation model means the generation of a new URT network. Different line schemes inevitably lead to different URT networks, making it possible to change the route choice of passenger flows on different line networks with fixed origins and destinations. When considering the response of passenger flows to changes in the topology of the network, it is necessary to perform a passenger flow allocation on the network after route optimization. Then the data such as urban rail traffic flow between stations in the network, linear distance between stations, and network distance between stations are updated, and these data are fed to the line generation model for calculating relevant objective functions and constraints, and determining the optimization direction of the model. Therefore, the route generation model must consider passenger flow allocation.

The existing research on traffic flow allocation is relatively mature. When the famous scholar Wardrop put forward the first and second theorems of traffic network equilibrium allocation, he began to study the traffic flow passenger allocation under traffic congestion with system analysis and equilibrium analysis methods. This enabled a great leap forward in the theory of traffic flow allocation. Based on classical methods, subsequent studies have improved the practicability and accuracy of the passenger flow allocation model by improving the generalized travel cost function [22], the selection method of effective path set [23], and the probability selection model of effective path selection [24].

Similar to the traffic flow allocation in the road network, passenger flow allocation in the URT network involves the allocation of the obtained passenger flow to each zone in the network according to certain rules and then determining the passenger flow for each zone in the network.

Considering the calculation complexity of the constructed line generation model and the granularity of routing and station locations in the planning stage of the network, the classical all-or-none passenger flow allocation method is adopted in this study. This means that all passenger flows on an OD will be allocated to the path with the lowest cost. The network distance traveled is used as the cost index, and the cost function can be expressed as

$$C_{k} = \mathop \sum \limits_{i \in I} l_{i},$$
(1)

where \(C_{k}\) denotes travel cost for path k, I represents the set of intervals that path k passes through, i is the ith interval in set I, and \(l_{i}\) is the network distance of the ith interval.

3.2 Potential Rail Transit Station Selection

In order to construct the local line generation model, after identifying the bottlenecks of the existing URT network, some potential URT stations should be selected in the bottleneck area.

The URT network is connected by lines between stations, and the impact of lines on the city is more than that of stations, because the station is only a local point area, while the line is a long strip of influence on the city, and the direction of the line and the layout of the station are mutually affected. Therefore, from the perspective of reducing the disturbance to the city and reducing the investment, it is necessary to restrict the route before determining the potential URT stations.

The general principle of line design is to lay the line as far as possible along the city road, and try not to cross the planning red line on either side. In order to reduce the disturbance to the built-up area, reduce the demolition, and control the project cost, the line plane should be laid along the road with a width of at least 20 m as far as possible. According to the construction standards of Chinese roads, the two-way expressways, main roads, and secondary roads with more than four lanes in large and medium-sized cities can be used as the routing constraints between alternative stations of URT lines.

Combined with the availability of data, this study selects city-level road network and conventional bus network to select potential rail transit stations in the region. The basic selection process is as follows:

  1. 1.

    Route determination: Routes should be preferentially laid on the urban road network with four or more vehicle lanes.

  2. 2.

    Preliminary determination: Locate stations at the intersections of expressways and trunk roads, trunk roads and main roads, and trunk roads and secondary roads that are four lanes or more. This is because locating the station at a road intersection with sufficient space allows for less demolition and disturbance to the surrounding area, as well as meeting the demand for travel in multiple directions.

  3. 3.

    Station determination: By visualizing the conventional bus network and counting the number of conventional bus line stations within 800 m around the preliminary alternative station in step 2, we ensure that there are certain travel demands around the selected station.

3.3 Model Assumptions

  1. 1.

    The potential stations are also ready for operation and the line generated by the proposed model is an extension, branch, liaison line of an existing line, or the first section of a new line.

  2. 2.

    All the generated line schemes adopt the same traffic organization mode, purchase the same number of trains, and use the same standard vehicles as the connected existing lines.

  3. 3.

    Assume that all new stations are underground stations.

  4. 4.

    Urban rail transit stations mainly include underground stations, ground stations, and elevated stations, and the costs of different laying methods are also different. In addition, the engineering environment of the station has a great impact on the cost, and the cost of different construction methods is also different. Since the model constructed in this work is mainly for the early stage of planning, it is assumed that a uniform estimate is used for all new stations ,and another standard uniform estimate is used for a few existing stations that need to be renovated.

  5. 5.

    All passengers travel according to the shortest path. When there are multiple shortest paths, one shortest path is selected at random.

3.4 Objective Functions

3.4.1 Passenger Flow

Generally, URT is built to serve the travel needs of citizens and relieve the pressure of urban road traffic. The volume of URT is the largest among all urban public transportation. Taking subway, for example, the one-way peak passenger volume can reach 30,000–60,000 per hour. Therefore, the size of passenger flow must be considered when building a new URT line. In addition, from the perspective of bringing convenience to residents’ travel, the greater the number of people covered by the station service area, the more residents can enjoy the travel convenience provided by URT.

In this study, the all-mode passenger flow between ODs is taken as the benchmark, and the URT passenger flow between ODs related to potential stations is obtained according to a certain proportion. According to the Annual Report of Beijing Transport Development in 2021 released by the Beijing Transport Development Research Institute, the travel share rate for URT in 2019 before COVID-19 was 16.5%. Here, 16.5% is taken as the URT share of all-mode passenger flow.

In this research, the number of people within 800 m around the station is taken as the substitute index of passenger flow, for two reasons. Firstly, the generated feasible line connections to the existing network in the model can result in changes in the network structure and the reconstruction of passenger flow distribution. Traditional methods for predicting passenger flow distribution in URT networks mostly rely on historical data patterns. However, after the network optimization, there is no historical passenger flow distribution data available for the new stations or between the new stations and existing ones. Moreover, the detailed data related to land use and socioeconomic characteristics associated with the lines are not easily obtainable. Therefore, accurately predicting passenger flow in the URT network after line optimization is a challenge. Secondly, based on research on the walkability of URT stations [25], residents consider a suitable walking distance to nearby stations to be around 750 m. This means that places that require a walking distance of more than 750 m are considered far, and in such cases, people tend to choose other modes of transportation or select alternative destinations. Considering the variability in the data used as the basis for calculation, such as height and walking speed mentioned in this paper, the suitable walking distance is appropriately relaxed to 800 m. Within this distance range, passengers are more likely to choose the station as their origin or destination for their travels. The mathematical form is shown as follows:

$$f_{1} = \mathop \sum \limits_{i = 1}^{{m_{K} }} P_{{S_{{{\text{A}}i}} }} + \mathop \sum \limits_{j = 1}^{{n_{K} }} P_{{S_{{{\text{E}}j}} }} ,$$
(2)

where \(f_{1}\) is passenger flow, \(S_{{\text{A}}}\) is the abbreviation for alternative station, \(S_{{{\text{A}}i}}\) represents the ith alternative station,\(S_{{\text{E}}}\) is the abbreviation of existing station, \(S_{{{\text{E}}j}}\) represents the jth existing station, \({ }P_{{S_{Ai} }}\) represents the number of people that can be covered within 800 m of the ith alternative station,\({ }P_{{S_{Ej} }}\) represents the number of people that can be covered within 800 m of the jth existing station, K represents the Kth new line scheme, \(m_{K} { }\) represents the number of alternative stations on the Kth optimized new line, and \(n_{K}\) represents the number of existing stations on the Kth optimized new line.

3.4.2 Project Construction Cost

URT engineering construction requires high project investment, and its design should be conducive to reducing investment and other costs without affecting safety and reliability and other conditions. URT is a huge and complex system engineering, which is composed of a series of related facilities and equipment, including lines, vehicles, stations, power supply system, communication, signal, fire protection, and environmental control systems. All systems cooperate to provide services for passengers. From the perspective of URT project budgets, the main costs of building a URT line include four types [26]: (a) station, section, communication, signal, track, and other engineering costs; (b) compensation for land acquisition, construction (structure) relocation, pipeline relocation, traffic relief, commercial compensation, and other project construction expenses; (c) reserve expense, also known as unforeseeable expense of project construction, which refers to the expense that should be reserved in advance for unpredictable expenses that may occur in the process of project implementation; and (d) special expenses such as vehicle purchase fee and loan interest. The statistical data for existing URT projects show that, generally speaking, the engineering cost of the project accounts for about 60% of the total cost, expenses for demolition and other engineering construction account for about 20%, reserve expenses account for about 3–4%, and special expenses such as vehicle purchase account for about 16%. The civil construction, installation of signal facilities, and compensation costs for land expropriation and demolition in the station area occupy about 80% of the budget estimate of a URT project, among which the budget estimate for the station constitutes 20% of the total budget estimate. Considering the availability of cost data and the small impact of special expenses such as reserve expense and vehicle purchase expense on investment, it is ignored here. The total project investment function is expressed as follows:

$$f_{2} = \sum \left( {\eta_{i} C_{{S_{{{\text{A}}i}} }} + \mu_{j} C_{{S_{{{\text{E}}j}} }} } \right) + \sum l_{a,b} {C_{{{\text{section}}}} + \gamma C_{{\text{stabling yard}}} + \lambda C_{{{\text{depot}}}} } ,$$
(3)

where \(f_{2}\) is the engineering construction cost; \(\eta_{i}\), \(\mu_{i}\), \(\gamma\), and \(\lambda\) are all 0–1 variables, that is, 1 if selected for construction or 0 otherwise; \(C_{{S_{{{\text{A}}_{i} { }}} }}\) represents the cost of constructing the ith alternative station; \(C_{{S_{{{\text{E}}_{j} { }}} }}\) is the abbreviation for existing station, \(S_{{{\text{E}}j}}\) represents the jth existing station; \(P_{{S_{{{\text{A}}i}} }}\) represents the cost of building a new interchange station near the jth existing station; \(l_{a,b} { }\) represents the interval line length between two adjacent stations a and b; \(C_{{{\text{section}}}}\) represents the construction cost of unit length interval; \(C_{{\text{stabling yard}}} { }\) represents the construction cost of a URT parking lot; and \(C_{{{\text{depot}}}}\) represents the construction cost of a depot on a URT line.

3.4.3 Linear Weighted Objective Function

Based on previous engineering experience, when the passenger flow is 10,000 and the construction cost is 1 billion yuan, the value of each objective function is similar. Therefore, this study chooses the linear weighting method to convert the two objectives into a single objective [27].

$$\min f = - z_{1} \times \left( {\mathop \sum \limits_{i = 1}^{{m_{K} }} P_{{S_{Ai} }} + \mathop \sum \limits_{j = 1}^{{n_{K} }} P_{{S_{Ej} }} } \right) + z_{2} \times \left( {\sum \left( {\eta_{i} C_{{S_{{{\text{A}}i}} }} + \mu_{j} C_{{S_{{{\text{E}}j}} }} } \right) + \sum l_{a,b} C_{{{\text{section}}}} + \gamma C_{{\text{stabling yard}}} + \lambda C_{{{\text{depot}}}} } \right),$$
(4)

where \(z_{1}\) and \(z_{2}\) represent the weight coefficients of the first and second objective functions, respectively.

3.5 Constraints

3.5.1 Station Interval Constraint

As the distance within the city is generally short, in order to ensure the attractiveness of the line, the energy consumption of the train, and the operational efficiency, the interval between stations should not be too large or too small.

The minimum allowed value of the distance between stations in the model is \(l_{\min }\), and the maximum allowed value is \(l_{\max }\), and \(l_{a,b}\) represents the network distance between station \(a\) and station \(b\). The mathematical form is as follows:

$$l_{\min } < l_{a,b} < l_{\max }$$
(5)

3.5.2 Single line length constraint

Similar to station intervals, the length of a URT line should not be too long or too short. In general, URT lines rarely have the problem of being too short. In addition, urban residents usually travel for short and medium distances, and URT lines have high investment in the early stage, so a URT line is usually not very long. According to existing studies, a reasonable length for a single line of URT ranges from 14.7 to 45.7 km and increases with the increase in passenger flow intensity. In the optimization model constructed in this study, the line upper limit value is set as \(L_{\max }\), and the total length of the line K is set as \({ }L_{K}\).

$$\begin{aligned} & L_{\min } < L_{K} < L_{\max } \\ & L_{K} = \mathop \sum \limits_{a = 1}^{{N_{K} }} l_{a,a + 1} \\ \end{aligned},$$
(6)

where \(N_{K}\) is the total number of stations for the Kth line scheme, a is the ath station, \(a + 1\) is the next station after the ath station, and \(l_{a,a + 1}\) is the distance between adjacent intervals (a, \(a + 1\)).

3.5.3 Quantity Constraint

The objective function value is different under the condition of generating different numbers of lines, so it is necessary to constrain the number of optimized lines as well. The number can be set to 1, 2, or the desired number.

$$N_{l} = 1 \;{\text{or}}\;N_{l} = 2\;{\text{or}}\;N_{l} = 3 \ldots,$$
(7)

where \(N_{l}\) is the number of lines.

3.5.4 Origin and destination constraint

In order to enable the optimized line to access the existing URT network, it is generally agreed that the stations at both ends of the optimized line should belong to the set of existing URT stations:

$$k\left[ 0 \right] \in \left\{ {S_{{\text{E}}} } \right\} \;\;{\text{and}}\;\;k\left[ { - 1} \right] \in \left\{ {S_{{\text{E}}} } \right\},$$
(8)

where \(k\left[ 0 \right]\) is the origin of the lines and \(k\left[ { - 1} \right]\) is the destination.

3.5.5 Obligatory Station Constraint

During actual line selection, there may be some important anchor stations to pass through. In order to be closer to the actual situation of line selection, the obligatory station constraint is added:

$$i \in \left\{ {k\left[ 0 \right],k\left[ 1 \right], \ldots ,k\left[ { - 1} \right]} \right\},$$
(9)

where \(k\left[ i \right]\) is the ith station of a line.

4 Algorithm

In view of the complexity and specialization of the model considering passenger flow allocation constructed in this study, the existing GA with elitist preservation (E-GA) is modified, and a method using E-GA nested with passenger flow allocation is proposed. The basic flow chart of this algorithm is shown in Fig. 1, and the steps are outlined in the following.

Fig. 1
figure 1

Basic process of E-GA nested with passenger flow allocation method

Step 1 Coding. According to the characteristics of the model constructed in this study, the number of candidate stations is set as the length of the gene, and the bases on the genes represent stations. Specifically, since there are 19 candidate stations, the length of an individual's genotype is set as 19. The 19 bases on the gene are encoded using integers from 0 to 19. Here, 0 indicates that no station is selected at that position, while an integer n between 1 and 19 indicates that station n is selected at that position.

Step 2 Initialization. Initialize the chromosome population through random generation, then calculate the fitness of each individual. Next, select N individuals to form a new population via the elitist selection algorithm. And by substituting the first parent population into the passenger flow distribution model, the data for passenger flow and network distance between ODs in the network are updated to calculate the value of the new objective function and the value of each constraint condition. A static penalty function is introduced to ensure that the generated solutions satisfy the constraints.

Step 3 Crossover and mutation. The new population can obtain a new parent population through operations of two-point crossover and uniform mutation. The crossover and mutation coefficients in this paper are selected based on empirical values. The crossover coefficient is set as 0.9, and the mutation coefficient is set as 0.1.

Step 4 Calculate the fitness of each individual in the new parent population, and then obtain the new parent population through elitist selection, two-point crossover, and uniform variation. And by substituting the parent population into the passenger flow distribution model, the data for passenger flow and network distance between ODs in the network are updated to calculate the value of the new objective function and the value of each constraint condition. A static penalty function is used to ensure that the generated solutions satisfy the constraints.

Step 5 Repeat step 3 and step 4 until the maximum generation is reached; then the algorithm is terminated.

5 Case Studies

5.1 Case Overview

We obtained a bottleneck in the Beijing URT network from relevant departments and selected its local area to verify the line generation model constructed in this study. The location diagram of the research area with respect to the whole URT network is shown in Fig. 2. In order to reduce the complexity of the model, the part of the interaction between Line 16 and Line 13 in the region is selected for example analysis. The research area is shown in Fig. 3.

Fig. 2
figure 2

Location of the study area with respect to the whole network

Fig. 3
figure 3

Schematic of the study area

The land use properties within the region are shown in Fig. 4, and the urban development maturity is relatively high. A number of high schools and universities are located in the area. Many technological parks and office areas are located along the highway. A dense residential area is present within the region. In addition, there are large shopping malls and a large integrated transportation hub with suburban railways, subways, and high-speed railways in the area. In general, the area is densely populated and has a high demand for travel.

Fig. 4
figure 4

Land use characteristics in the area

URT stations according to their location and role in the network can be divided into the intermediate stations and transfer stations. Transfer stations are larger than typical  intermediate stations, and the investment is usually larger. Referring to the existing document [26], the investment of an intermidate station is taken as 100 million yuan, the investment of a transfer station is taken as 150 million yuan, and the cost of intervals is 1 billion yuan per kilometer.

5.2 Selecting Alternative Sites

Based on the method above, alternative stations are selected through the urban road network and the conventional bus network. As shown in Fig. 5, there are 12 secondary arterial roads in the region. Dense white dots represent bus stations, red dots represent existing URT stations, and pink dots represent potential stations. Finally, an alternative set of 19 stations including five existing URT stations is formed. It should be noted that the Nongdananlu station is represented as 1 in the algorithm and analysis, which is different from the representation in the figure.

Fig. 5
figure 5

Distribution of bus stations and URT stations in the region

Demand for rail traffic at existing stations can be obtained from URT card-reading data. The rail traffic of the potential station is obtained by discounting the all-mode traffic around the station according to a certain percentage. The proportion can be obtained by referring to the URT trip share rate in the Annual Report on Transportation Development in Beijing 2021. To avoid the impact of the COVID-19 pandemic, the 2019 trip share of 16.5% is taken as the conversion factor for all-mode passenger flow to rail traffic. Finally, the average daily rail transit passenger flow for each candidate station is shown in Table 1.

Table 1 URT passenger flow in candidate stations

5.3 Results and Analysis

5.3.1 Algorithm Analysis

The proposed algorithm is a modified algorithm based on the GA. Therefore, the process of searching for the optimal solution has some similarities with the GA. If the evolutionary generation is too small, the algorithm does not converge easily, and the population is not yet mature. On the other hand, if the evolutionary generation is too large, the algorithm is already proficient or the population is too premature to converge, and it is meaningless to continue the evolution.

In order to explore the convergence problem of the single-objective line generation model and set up appropriate evolutionary generation, we first designed four experimental schemes of 100 generations, 200 generations, 300 generations, and 500 generations according to a weight of 1:1 and conducted several experiments. The evolution of the average objective function value of the population and the optimal objective function value of the population along with the evolutionary generations under the four typical schemes are shown in Figs. 6, 7, 8, and 9, respectively. The orange line represents the population optimal individual objective function value, and the blue line represents the average individual objective function value. The optimal objective function value and the optimal route under each scheme are shown in Table 2.

Fig. 6
figure 6

Evolution of optimal/average objective function value for each generation of population (100 generations)

Fig. 7
figure 7

Evolution of optimal/average objective function value for each generation of population (200 generations)

Fig. 8
figure 8

Evolution of optimal/average objective function value for each generation of population (300 generations)

Fig. 9
figure 9

Evolution of optimal/average objective function value for each generation of population (500 generations)

Table 2 Analysis of results under different evolutionary generations

By combining Table 2 and the evolution process of the optimal objective function value of the population and the average objective function value of the population under the four schemes, it can be found that when the evolutionary generation is set to 100 or 200 generations, the results converge around the 35th and 40th generations, respectively, and the optimization results are the same. When the generation of evolution increases to 300 or 500 generations, the evolution process converges in less than 20 generations, indicating that it is too premature to continue the evolution of the situation. Therefore, the number of evolutionary generations set to 100 or 200 is more suitable for the model in this research. Several experimental results show that the single-objective line generation model converges within 50 generations in most cases, and the latest convergence is within 60 generations. This indicates that the constructed E-GA with nested passenger flow allocation method is fast and effective for solving the model in this study.

The optimized “best individual” in the four cases is 2-10-12-13-16-19, which is translated as Malianwa–station 8–station 10–station 11–station 14–Xi’erqi, and the objective function value is −2.85582. Fig. 10 shows the optimized line in the study area.

Fig. 10
figure 10

Schematic of the optimal scheme for the model

5.3.2 Analysis of Different Weights

After determining that the reasonable evolutionary generation of the model constructed in this study is 100 or 200 generations, the optimal lines under different weights were studied. The weight settings are based on the practical requirements of the problem and can be readily adjusted according to the decision-maker's needs. As URT networks are essential for public welfare, this paper primarily considers two scenarios: maximum number of people served and minimum project cost are equally important, and maximum number of people served is more important than minimum project cost. For these two scenarios, we design two cases with weight ratios of 1:1 and 2:1. Generally, since the total weight sums to 1, the weights are set as 1/2 and 1/2, as well as 2/3 and 1/3. This takes into account the impact of different weights on the results, providing decision-makers with flexibility in choosing weights.

The evolution of the average objective function value of the population and the optimal objective function value of the population along with the evolutionary generation under the two schemes are shown in Figs. 11 and 12, respectively. The optimal objective function value and the optimal route scheme under each scheme are shown in Table 3.

Fig. 11
figure 11

Evolution of optimal/average objective function value of population with weight 1:1

Fig. 12
figure 12

Evolution of optimal/average objective function value of population with weight 2:1

Table 3 Optimal route scheme and objective function value under different weights

As can be seen from the above figure, regardless of which weight method is used, the algorithm converges within 100 generations. The optimal route optimization scheme for 1:1 and 2:1 weight is 2-10-12-13-16-19, and the location diagram is shown in Fig. 10.

5.3.3 Constraints on Different Required Sites

In addition, we designed two experimental schemes with weight of 1:1 and evolution of 100 generations, i.e., experiments with mandatory station constraints and experiments with fixed starting and finishing station constraints.

Station 7 is taken as the necessary station, and the proposed algorithm is adopted for calculation. The evolution diagram of population optimal/average objective function value is shown in Fig. 13. The optimal route scheme is 2-10-9-15-19, and the objective function value is 0.3807. The location diagram is shown in Fig. 14.

Fig. 13
figure 13

Evolution of optimal/average objective function value of population with site constraints

Fig. 14
figure 14

Schematic of optimal route scheme under the constraint of station 7

Nongdananlu station and Xi’erqi station are taken as the fixed OD constraints. The evolution of the population optimal/average objective function values is obtained, as shown in Fig. 15. The optimal route scheme is 1-3-12-16-19, and the objective function value is −1.51897. The location diagram is shown in Fig. 16.

Fig. 15
figure 15

Evolution of optimal/average objective function value of population with origin–destination constraints

Fig. 16
figure 16

Schematic of optimal route scheme under origin and destination constraints

5.4 Policy Suggestions for Resolving Bottlenecks

Resolving bottlenecks is beneficial for strengthening the overall URT capacity, improving transportation services, and enhancing operational management efficiency. This study proposes policy recommendations for resolving bottlenecks from three perspectives: network structure, planning and operations, and industrial layout.

5.4.1 Enhancing Transport Coordination

Expand the URT network: Consider adding new lines or branch lines to the URT network to cover underserved areas and ensure that more areas can benefit from URT services.

Coordinate public transportation services: Coordinate the public transportation system such as bus system with the URT network to meet the travel demands in the region and enhance the synergy between the two.

5.4.2 Focusing on Planning and Operations

Relevant planning, design, and operations departments should focus on the underlying causes of bottleneck areas. In future planning and design processes, these issues should be addressed to improve service quality.

5.4.3 Optimizing Industrial Layout

Arrange industries or chains of industries that are highly reliant on URT commuting on the same rail line. This helps increase rail passenger flow, especially in areas with low passenger volume on the URT network periphery, thus maximizing URT utilization.

6 Conclusions

As the URT network in metropolises like Beijing continues to expand, the focus has shifted from construction to renovation and improvement. The transformation and enhancement of existing URT networks have become urgent requirements. However, research on optimizing existing URT network operations lacks attention to actual operational conditions, and only a few studies have considered optimizing existing networks under the constraints of a large-scale URT system. This study proposes a method for selecting potential URT stations in bottleneck areas and constructs a line generation model to optimize and improve the routes in these bottleneck areas. Additionally, a GA method with nested passenger distribution is designed to solve this problem. Taking the existing URT network in Beijing as an example, the study analyzes the interconnections between local networks to optimize them. The results show that the algorithm converges within 100 iterations, indicating its rapid and effective performance. The generated solutions can meet regional travel demands and optimize the relationships within the URT network. To enhance the attractiveness of URT, this study also provides some policy recommendations.

Although this study has made certain progress in the theory and practice of optimizing existing URT networks, there are still some issues that need to be resolved due to a lack of knowledge and time. Considering the complexity of the research problem, some factors have been simplified in the modeling process. For example, assumptions have been made to simplify the analysis, population coverage within an 800-m range has been used as a proxy for passenger flow indicators, and a shortest path-based all-or-none passenger distribution method has been employed, without taking into account factors such as transfer times and congestion levels that may affect route selection. In the next phase of research, it is necessary to consider more comprehensive factors and indicators and establish more accurate models. It is also important to compare the performance of the proposed method with other algorithms.