1 Introduction

The problem of train routing concerns the choices of routes for scheduling trains on railway network. Train routing problem is a highly constrained network optimization problem that has confounded many traditional and modern optimization methods. The objective is to minimize the cost from a specified origin station to its destination station. It has been shown that the optimal route can be found with essentially local strategies, i.e. with strategies that do not require precise global information of the network (Kleinberg 2000; Watts et al. 2002). Random walk is one of them.

The theory of random walk has a long history and has been applied to solve numerous theoretical and practical problems (Spitzer 1976; Jespersen et al. 2000). The dynamics of random walk on network represents a powerful tool, it can be used to explore the connection between the network topology and functional properties of the network (Jespersen et al. 2000). The random walk is also interesting since it could be a mechanism of transport and search on network (Guimerà et al. 2002; Adamic et al. 2001).

Based on random walk, we propose an algorithm for solving the problem of optimal route. We improve the random walk search strategy, and make it suitable for searching the optimal route on railway network. The paper is organized as follows. We introduce the background in Section 2. In Section 3, we outline the principle of the design of new transportation plan. The proposed algorithm and numerical computation results are respectively presented in Sections 5 and 6. Finally, conclusions of the proposed approach are presented.

2 Background

The optimization model for train routing problem is defined over a network, where the nodes represent stations and the edges represent existing and possible train connections among these stations. Assad proposed a multi-commodity network flow model for train routing problem, where some levels of interactions between routing and yard activities are incorporated (Assad 1980). Haghani proposed a formulation and a solution method for a combined train routing and makeup (Haghani 1989). Rodriguez presented a programming model for the route of trains travelling through a junction (Rodriguez 2007). Zwaneveld considered the problem of routing trains through railway stations (Zwaneveld et al. 1996). Carey et al. considered the train routing problem for large, busy, complex train stations, which includes heuristics analogous to those used in practice by train planners and managers (Carey and Carville 2003). In the engineering and design, the route of trains is under the constraints of network structure and infrastructure systems. In Zhang et al. (2005), Zhang et al. presented a preliminary network flow equilibrium model to identify and understand the interactions among infrastructure systems. O’Kelly outlined a simple notation and analytical framework for optimizing flows within a hub (O’Kelly 2009). Traffic incident also impacts on the train routing. Daamen et al. described a tool that identifies route conflicts and the train numbers involved (Daamen et al. 2009).

Many optimization models use iterative algorithms to improve the performance of train routing. The first model proposed by Szpigel (1972). Later, Carey and Lockwood developed a basic model to serve as the foundation for further research (Carey and Lockwood 1995). Other authors developed the computer-aided dispatch systems (Petersen et al. 1986; Jovanovic and Harker 1991). In order to solve some complex problems, researchers have devised various algorithms for optimizing the train routing, such as the heuristic method based on Lagrangian relaxation (Keaton 1989), the combination of genetic and tabu search algorithms (Gorman 1998a), expert system (Lozano et al. 2002) and neural network (Martinelli and Teng 1996). However, with the time constraints for real-time optimization and the enormous size of the most railway networks, none of the existing methods provide ideal results (Blum and Eskandarian 2002).

The method for train routing can be divided into two groups: computer-aided simulation and mathematical programming. In practice, mathematical programming methodologies were not widely employed. For example, the most basic formulation of train routing is the line-planning problem, but it is still done by hand (Kleinberg 2000). Instead, computer-aided simulation methodologies were mostly used (Ghoseiri et al. 2004). However, Most computer-aided dispatch systems either attempt little or no optimization (Blum and Eskandarian 2002). An evident instance is that the Santa Fe railway was unable to implement the iterative methods because they were too time-consuming. Instead, their train routing relies on a simple heuristic (Gorman 1998b).

In this paper, an improved walk search strategy is proposed to solve the train routing problem. The proposed strategy is a computer-aided simulation algorithm, where the optimization for train routing is considered. The proposed algorithm has the following advantages: (1) Since the time is discrete, numerical implementation of the proposed search algorithm is relatively simple. Using some updated rules, the proposed algorithm can be executed with shorter computation time. (2) The objective function can be written as a flexible formula, such as some rules and conditions. This feature means that the proposed algorithm can be used to solve many practical optimization problems. (3) To our knowledge, this work explicitly shows this effect for random walk for the first time.

3 The design of new transportation plan

In general, a railway network is divided into many small areas. From one area to another area, there is a passenger flux per day. The departure station is called the origin station, and the arrival station is called the destination station. The origin-destination (OD) flux of passengers represents the number of passengers who depart from the origin station and arrive the destination station per day. Demand for railway transportation is usually expressed in terms of the OD flux. Given these demands, the planner must establish a set of operations that will govern the routes of trains. For every OD pair of traffic demand, the train dispatcher assign each train to a route which includes distinct combinations of the track sections and routes. Usually, OD flow is not only under the travel time restriction, but also is under the constraints of population and employment. But, in railway transportation, when workers design the train schedule and plan, train routing is mainly assigned by the travel time. In the proposed algorithm, we regard the travel time as the optimization object.

In practical railway transportation system, the design of new passenger transportation plan is usually divided into several steps (Assad 1980; Bussieck et al. 1997; Ghoseiri et al. 2004). At first, the passenger demand has to be assessed and analyzed on a given railway network. According to the passenger demand, planners determine the departure and arrival stations of trains, the types of trains and the number of trains etc. Second, planners determine the routes of trains on such a given network. In this step, the main problem is how to choose the routes of trains from specified departure stations to specified arrival stations, which is called train routing problem. Then, according to the routes of trains, planners determine the plan of train travelling. In this step, the main problem is how to determine the departure and arrival time of trains at each station.

From a specified departure station to a specified arrival station, there are many possible routes. The chosen route of trains is the optimal route which has the smallest transportation cost among the possible routes. The physical layout of the track, such as control signaling and switch, imposes constraints on train travelling. These constraints mainly influence the departure and arrival time of trains at each station. In most case, these constrains are not considered in train routing problem.

4 The search algorithm based on deterministic walk

All railway networks can be represented in graph data structures. Let G = (V,E) be an undirected graph, where V is the set of nodes and E is the set of edges. Each edge e m has a weight c m which is defined as the cost that unit train flow passes through the edge e m . A route between two nodes is an alternating sequence of nodes and edges. The set of routes can be represented as

$$R_{ij}=\{\rho_{ij}^{l}:l=1,2,..., \lambda(i,j\,)\}.$$
(1)

Where \(\rho_{ij}^{l}\) is the lth route between node i and node j, and λ(i,j ) is the number of the alternative routes between node i and node j. The edge correlation matrix can be defined as

$$a_{ij}^{ml}=\left\{ \begin{array}{ll} 1 &\quad \mbox{if $e_{m}\in \rho_{ij}^{l}$}\\ 0 &\quad \mbox{other}. \end{array} \right.$$
(2)

The cost carrying out the unit train flow from node i to node j is as follows.

$$c_{ij}^{l}=\sum\limits_{m \in E} a_{ij}^{ml}c_{m}.$$
(3)

Let N ij be the train flow from node i to node j. Since there can be more than one route between any two nodes, a binary decision variable is introduced for each allowed combination of a train and a route. The decision variable is defined as the follows.

$$x_{ij}^{l}=\left\{ \begin{array}{ll} 1 &\quad \mbox{if $N_{ij}$ selects the route $\rho_{ij}^{l}$}\\ 0 &\quad \mbox{other}. \end{array} \right.$$
(4)

The problem of finding a route with the minimum cost between two specified nodes can be expressed as a mathematical formula, which has the following form

$$min \sum_{i,j} N_{ij} \sum_{l} c_{ij}^{l}x_{ij}^{l},$$
(5)
$$s.t. \sum_{l} x_{ij}^{l}=1, \quad \forall i,j$$
(6)
$$x_{ij}^{l} \in \{0,1\}, \quad \forall i,j,l.$$
(7)

When a train travels from an origin station to its destination station, only the default route is chosen. Conventionally, the train routing problem is solved by listing out all the default routes. However, since the number of the stations is large, this approach leads to that the amount of work and its need of memory increase. A common approach to solve this problem is to use the graph search algorithm. In this paper, we use the deterministic walk search strategy to solve the train routing problem.

In principle, the flow between an OD pair can be loaded to multiple available routes. When the travel time is related to the flow, the shortest route could vary with time. In this case, the flow could be assigned to different route. However, when the travel time is not related to the flow, the shortest route does not vary with time. In this case, the flow is assigned to the shortest route. This means that larger OD flow is assigned to the shortest route. In railway traffic, the travel time is not related to the flow. When passengers travel form one station to another station, they always choose a shortest route where they can spend smaller travelling time. This means that on the shortest route, the OD flux of passengers from specified origin station to specified destination station is larger than that on other possible routes. Based on above ideas, in our method, walker chooses its neighbor node which has a larger OD flux from the chosen nodes to specified destination node with a high probability. Here the neighborhood node is the node which has a direct connection to the current node.

Using purely local information, such as the number of nodes and edges, we implement the walk search strategy that monitors the neighbor nodes in search of its destination node. In general, from a given origin node to a given destination node, there are a number of possible routes. The search process for walker will be repeated many times. In the search process, the walker moves according to the following deterministic rules: (a) Starting from node r, the walker is only restricted to move to one of its neighbor nodes. Moreover, the edge connecting the node r and the chosen neighbor node should not be visited in preceding time steps. If all such edges have been visited in the preceding time steps, the walker randomly choose one of them; (b) The probability the walker moving from node r to its sth neighbor node is P s  = f sj /( ∑  k f kj ), where f sj represents the OD flux between the node s and the destination node j, and k represents the kth neighbor node of node r. The search cost for this step is calculated by c m  = d rs /v. Here d rs is the length of the edge connecting node r and node s, and v is the speed that walker moves. In fact, the search cost c m represents the time that walker moves from node r to node s.

The probability P s ensures that the proposed algorithm can find optimal route with shorter search time. As discussed above, OD flow is larger on optimal route. When OD flow is larger, this probability is high. If this probability is high, the proposed algorithm can find the optimal route with shorter search time. This means that the probability P s reduces the search time. In addition, this probability mainly affects the search process. It ensures that the optimal solution can be found. The found optimal solution usually has smaller value of objective function. In other words, this probability ensures that the proposed algorithm can find the route which has smaller value of objective function.

The objective function is defined as that from a specified origin node to a specified destination node, walker has the smallest travelling time (or search cost). The executed process of the proposed algorithm can be described as follows.

  1. Step 1:

    initialization parameters. Circle time k c =0. k cmax is the maximum circle time.

  2. Step 2:

    displace one walker on the origin node i.

  3. Step 3:

    circle time k c =k c +1.

  4. Step 4:

    according to the deterministic walk rules described above, the walker moves from the origin node i to its destination node j. As the walker arrives its destination node j, a chosen route is found, or a solution is obtained.

  5. Step 5:

    if k c  < k cmax , go to Step 2.

  6. Step 6:

    record the best solution which has the smallest search cost among the obtained solutions.

In step 4, only one route is obtained among all possible routes. This means that the search process is under the constraint \(\sum_{l} x_{ij}^{l}\)=1. In step 6, we can range all the obtained solutions. Using this step, one (or more) optimization solution(s) can be found from a specified origin node to a specified destination node.

When we use the proposed algorithm to solve the optimization problem of train routing, from a specified origin station i to a specified destination station j, we need find one (or two) optimal route(s). Here a walker represents the train flow N ij . For the entire railway network, we usually need to determine many optimal routes among many specified stations. In this case, we can repeatedly execute the proposed algorithm many times to find all optimal routes for all specified OD pairs. The total search cost for the network is calculated by \(\sum_{i,j} N_{ij} \sum_{l} c_{ij}^{l}=\sum_{i,j} N_{ij} \sum_{l} \sum_{m}a_{ij}^{ml}c_{m}\). When a special event occurs, such as the accident, confliction or the visitation of Providence, rerouting for trains will be considered. In this case, we use at least two optimal routes between two specified stations.

5 Computation results

In order to test the efficiency of the proposed search strategy, we apply the proposed algorithm to find optimal routes of trains on a part of railway network. Here the speed v for walkers is set to be v = 200 km/h. Figure 1 depicts the part of the railway network. This part of the railway network includes 22 track sections and 14 stations. For the sake of convenience, these stations are respectively labelled by the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 and 14. The lengths of the track sections are given in Table 1.

Fig. 1
figure 1

An example of railway network

Table 1 Lengths of track sections

On the part of the railway network, the OD flux of passengers is given in Table 2. Here the number, such as 81 in column 8, means that there are 81 passengers travelling from station 1 to station 8 per day. On the basis of the data given in Table 2, we determine the departure and arrival stations of the routes of trains on the railway network. Between two stations, if the number of passengers who directly depart from one station to other station is larger than 300 per day, we will design a route of trains between such two stations. This means that trains will directly travel between the two stations. This criterion usually is promulgated by government according to practical demand. For example, the OD flux of passengers between station 1 and station 5 is larger than 300, we will assign a route of trains between station 1 and station 5. The assigned route between station 1 and station 5 is expressed by \(1 \leftrightarrow 5\). The final determined results are as follows. \(1 \leftrightarrow 5\), \(1 \leftrightarrow 10\), \(2 \leftrightarrow 10\), \(3 \leftrightarrow 12\), \(4 \leftrightarrow 7\), \(7 \leftrightarrow 13\), \(8 \leftrightarrow 13\), \(10 \leftrightarrow 1\), \(10 \leftrightarrow 8\), \(13 \leftrightarrow 1\).

Table 2 OD flux of passengers

The proposed algorithm is suitable for computer programming. Using such an algorithm, we obtain the optimal routes of trains on the railway network. Table 3 gives the obtained results. The routes listed in column 3 represent the optimal routes of trains, and the costs listed in column 4 represent corresponding minimum search costs. For example, the route 1 → 3 → 5 listed in column 3 is the optimal route. Such a route has the smallest search cost among all possible routes from the origin station 1 to the destination station 5. In Table 3, most of routes of trains have the smallest search costs, but the route, i.e. 1 → 2 → 7 → 10, only has a smaller search cost. In the case study, from the node 1 to its neighbor nodes, the OD flow between the nodes 1 and 2 is much larger than that between the node 1 and other neighbor nodes. Using the proposed search algorithm, walker starting from the node 1 selects its neighbor node 2 with a higher probability. In this case, the obtained optimal solution is 1 → 2 → 7 → 10, which is a local optimal solution among all possible solutions. Moreover, it is a second optimal solution, which is only smaller than the global optimal solution 1 → 4 → 6 → 9 → 10.

Table 3 Routes of trains and search costs

Since the probability for the walker to step in one direction is greater than that in other directions, the walker propagates in such a direction with higher probability and eventually visit the nodes in this direction. Figure 2 presents the search process for walker. Here, the horizontal axis indicates the step that walker moves, and the vertical axis indicates the visited station. The chosen route for the walker is between the stations 13 and 1. In Fig. 2, the optimal route between stations 13 and 1 is a solid line with symbol “o”, and other dash lines represent different chosen routes. Here the numbers represent the search costs for each lines. From Fig. 2, it is obvious that the optimal route has the smallest cost among the chosen routes.

Fig. 2
figure 2

The search processes for walkers

We programme the proposed algorithm on a personal computer which has one processor. When the algorithm is executed, we record the CPU time (finding the optimal solution). Table 4 gives the average CPU time for finding optimal routes of trains. The computation experiment is made 10 times, the final result is obtained by averaging them. From Table 4, it is obvious that the optimal routes of trains are obtained with shorter computation time. This means that the proposed search algorithm is an effective tool for optimizing train routing.

Table 4 CPU time for optimizing train routing

In practical applications, k shortest paths algorithm is a common used algorithm (Eppstein 1998). Using the k shortest paths algorithm, the Dijkstra’s algorithm which is the best known algorithm up to now (Dijkstra 1959), is used to find the shortest route between any two specified nodes. For the purpose of investigating the performance of the proposed algorithm, we compare the proposed algorithm with the k shortest paths algorithm. In Fig. 1, there are totally 182 possible OD pairs. For each OD pair, we made one test respectively using the k shortest paths algorithm and the proposed algorithm. Among 182 measurements, we finally find 8 different optimal routes. The comparison results are listed in Table 5. Here we only list the different routes. From Table 5, we can see that the routes listed in column 3 are all the global optimal solutions, but the routes listed in column 4 are all the local optimal solutions. The costs of these local optimal solutions are only smaller than that of the global optimal solutions, i.e. these local optimal solutions are the second optimal solutions. The comparisons indicate that the performance of the proposed algorithm is very close to the classical k shortest paths algorithm. However, unlike the k shortest paths algorithm, the proposed algorithm can be used when weights have negative values.

Table 5 Different optimal routes

In order to further test the proposed algorithm, we compare the optimal routes obtained by two different search algorithms on a larger railway network which has 47 stations. The numerical computation results indicate that: (1) when we only need to find one optimal route between two specified stations, the CPU time using the proposed algorithm is close to that using the k shortest paths algorithm; (2) when we need to find two optimal routes between two specified stations, the CPU time using the proposed algorithm is obviously smaller than that using the k shortest paths algorithm. Table 6 gives part of the comparison results. Here each optimal route is called the second shortest route, which is only smaller than the shortest route. From Table 6, we can see that the CPU time listed in column 4 is obviously smaller than that listed in column 3. Moreover, using the proposed algorithm, the objective function can be written as a flexible formula. This feature ensures that the proposed algorithm can be applied extensively.

Table 6 CPU time for optimizing train routing

6 Conclusions

In conclusions, we propose a new search strategy to solve the problem of train routing on railway network. The proposed algorithm is based on deterministic walk. Since the proposed algorithm consists of some simple rules, it can be executed with shorter computation time. Remarkable, it can be integrated into a decision support system used by operators who make decisions to determine the optimal routes of trains.

It should be pointed out that the proposed algorithm depends on the amount and quality of OD flow data. On a large network, since the stations and routes are larger, a great amount of OD flow data is needed. In addition, the good performance of the proposed algorithm needs the OD flow data with high quality. In the proposed algorithm, the capacity constraint of each link (or route) is not considered. This limits the practical applications of the proposed algorithm. In our future work, we will extend the proposed algorithm where the capacity constraint is considered.

The proposed search algorithm is a heuristic algorithm. As a heuristic algorithm, its search process is to select a solution as optimal solution from possible solutions. Such an optimal solution possibly is not a globally optimal solution, but it surely is a locally optimal solution. In other words, there is no guarantee that such an algorithm will provide a global optimal solution over all time. In fact, it is an improved heuristic search strategy, and we believe that the proposed search strategy will gain new perspectives for route choice modelling. The proposed algorithm is also used to solve the optimal route problem in other fields, such as the city traffic and internet.