Introduction

Spatial data can be represented at different scales, which facilitates map navigation and spatial analysis. Fully automated transformation of a map from one scale to a smaller scale remains an open research topic in mapping and cartography (Li 2006). This study is concerned with selective omission in road network data, because roads are among the most important geographical features on a map, and selective omission (the retention of the more important roads) is a necessary operation in automated road network generalization.

Selective omission in a road network has been studied extensively. Some researchers analyzed road segments (Mackaness and Beard 1993; Mackaness 1995; Thomson and Richardson 1995) or road intersections (Mackaness and Mackechnie 1999) for selection, because a road network is usually stored in a database as intersections and segments. Others built strokes, defined as ‘a set of one or more arcs in a non-branching and connected chain’ (Thomson and Richardson 1999), and based the selection on those strokes. Strokes make it possible to analyze road networks by the importance of individual roads, even in the absence of all other thematic information (Thomson and Brooks 2007). The importance of each stroke may be determined by various properties, such as road length, stroke connectivity (Zhang 2004a), and degree, closeness and betweenness centralities (Jiang and Harrie 2004). Most recent work proposes integrated indicators that combine several road properties. A method based on complex network analysis was proposed to estimate the hierarchies of urban road networks, in which degree, closeness, betweenness and length are considered (Luan et al. 2012; Liu et al. 2014; He et al. 2015). An integrated approach was proposed in which different structures or patterns in a road network are considered (Li and Zhou 2012; Yang et al. 2013). Xu et al. (2012) based road selection on the functionality of the stroke, taking into account the connectivity and the geometric structure of the road network.

However, there has been little evaluation of the situations (in terms of network functionality and cartography) to which these new composite indexes are suited. To our knowledge, no study has compared the composite indexes or sought a new composite index that defines the importance of a road in terms of the functionality of the network structure, which is the main concern of this study.

The study is organized as follows. Section “Methodology and Experiment Design” briefly describes the road ranking approaches, the evaluation method and the study area. Section “Experiment Result and Analysis” presents the experimental results. Finally, conclusions are drawn and future work is outlined.

Methodology and Experiment Design

To evaluate comprehensively how road importance is measured in a real road network, one typical road ranking approach and a new measure are used. This section briefly introduces the approach, the measure and the study area.

The stroke-based approach was first proposed by Thomson and Richardson (1999). It has two steps: building the strokes and ordering the strokes. Building the strokes means concatenating continuous and smooth road segments into a whole. Ordering the strokes means ranking them from high to low importance. Evaluating the importance of each stroke is therefore crucial.
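The "continuous and smooth" condition used when building strokes is commonly tested with a deflection angle at each junction. The following is a minimal sketch of such a test, not the authors' implementation. It assumes planar coordinates and uses the 45-degree threshold adopted in this study; the function names `deflection_angle` and `can_concatenate` are hypothetical.

```python
import math


def deflection_angle(p, q, r):
    """Angle in degrees between the direction p->q and the direction q->r."""
    a1 = math.atan2(q[1] - p[1], q[0] - p[0])
    a2 = math.atan2(r[1] - q[1], r[0] - q[0])
    diff = abs(math.degrees(a2 - a1)) % 360.0
    return min(diff, 360.0 - diff)


def can_concatenate(p, q, r, threshold_deg=45.0):
    """True if segment q->r may continue the stroke that arrives along p->q."""
    return deflection_angle(p, q, r) <= threshold_deg


# Two candidate continuations at junction (1, 0) for a segment from (0, 0).
gentle_bend = can_concatenate((0, 0), (1, 0), (2, 0.5))   # about 27 degrees
right_turn = can_concatenate((0, 0), (1, 0), (1, 1))      # 90 degrees
```

When several segments at a junction pass the test, a rule is still needed to choose among them (for example, the smallest deflection angle); that choice is left open here.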

Dual Graph of the Road Network

In recent years, complex network theory has increasingly been applied to transportation and GIS, enabling deeper analysis of the complexity and functionality of road network structure. Figure 1 shows different topological representations of the same road network. Compared with the generated topology of the network (Fig. 1b), the dual graph (Fig. 1c) supports further analysis of the network structure and functionality (Boccaletti et al. 2015). The dual graph is also well suited to analyzing the connectivity and reliability of a road network and the importance of individual roads within it.

Fig. 1 Transportation network topology structure: a the real road network; b the generated topology of the network; c the dual graph of the network
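As a rough illustration of the dual representation, the sketch below turns strokes into nodes and stroke intersections into edges. It assumes a hypothetical input format in which each stroke is described by the set of junction ids it passes through; the name `build_dual_graph` is likewise an assumption of this sketch.

```python
from itertools import combinations


def build_dual_graph(strokes):
    """Return the dual graph as {stroke_id: set of intersecting stroke_ids}.

    `strokes` maps a stroke id to the set of junction ids the stroke passes
    through (a hypothetical input format chosen for this sketch).
    """
    dual = {s: set() for s in strokes}
    for a, b in combinations(strokes, 2):
        if strokes[a] & strokes[b]:  # the two strokes share a junction
            dual[a].add(b)
            dual[b].add(a)
    return dual


# Toy network: stroke A crosses B and C, while B and C never meet.
toy_strokes = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 5}}
toy_dual = build_dual_graph(toy_strokes)
```

The pairwise comparison is quadratic in the number of strokes, which is acceptable for networks of a few hundred strokes; an index from junction to strokes would scale better.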

Structural and Geometric Characteristics of the Road Network

Network centrality is commonly used as an index for analyzing the structural characteristics of a complex network. There are three basic centrality indexes: degree, betweenness and closeness, as shown in Fig. 2. The clustering coefficient is another important index and is also considered in this paper. Four structural indexes and one geometric index are therefore used to evaluate the importance of a stroke.

Fig. 2 Example of centrality maximums

(1) Centrality of degree is expressed as follows:

$$Degree = D_{i} = \sum\limits_{j = 1}^{n} {\delta_{ij} }$$
(1)

where $\delta_{ij}$ indicates whether stroke $i$ intersects stroke $j$: it is 1 if they intersect and 0 otherwise. In the structural analysis of a road network, the greater the degree, the more roads a stroke connects to. The degree therefore plays a significant role across the entire road network.

(2) Centrality of closeness is expressed as follows:

$$Closeness = C_{i} = 1/\sum\limits_{j = 1,j \ne i}^{n} {n_{ij} }$$
(2)

where $n_{ij}$ is the number of strokes on the shortest path from stroke $i$ to stroke $j$. Closeness is a global measure that indicates the center of a city. High-rank roads should be easily accessible from other roads. Compared with degree centrality, closeness also describes the accessibility of a stroke to the strokes it is only indirectly connected to. The greater the value, the wider the stroke's range of service and influence, and the higher its rank.

(3) Centrality of betweenness is expressed as follows:

$$Betweenness = B_{i} = \sum\limits_{j \ne k \ne i}^{n} {n_{jk} (i)/n_{jk} }$$
(3)

where $n_{jk}$ is the number of shortest paths from stroke $j$ to stroke $k$, and $n_{jk}(i)$ is the number of those paths that pass through stroke $i$. In a road network, the greater the betweenness of a stroke, the more often it lies on shortest paths, and the more pronounced its role as a bridge or hub.

(4) Clustering coefficient is expressed as follows:

$$CC_{i} = \frac{2e_{i}}{k_{i} (k_{i} - 1)}$$
(4)

where $k_i$ is the degree of stroke $i$, and $e_i$ is the number of connections that exist among its $k_i$ neighbors (equivalently, the number of triangles that include stroke $i$). Unlike degree centrality, a smaller clustering coefficient indicates that the node plays a greater functional role in the network.

The length of a stroke is the total length of the continuous and smooth road segments that form it, and is its geometric property. The longer the stroke, the higher its rank in the road network.
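The four structural indexes above can be sketched on a dual graph stored as an adjacency dictionary. This is an illustrative sketch under several assumptions: path length is counted in hops between strokes, betweenness follows the standard shortest-path definition (summing, over pairs of other strokes, the fraction of their shortest paths that pass through stroke $i$) and is computed with Brandes' accumulation scheme, and strokes with fewer than two neighbors are given a clustering coefficient of zero. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_hops(dual, source):
    """Hop distance from `source` to every stroke reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in dual[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree(dual):
    """Eq. 1: number of strokes that each stroke intersects."""
    return {i: len(nbrs) for i, nbrs in dual.items()}


def closeness(dual):
    """Eq. 2: inverse of the summed shortest-path lengths (hops assumed)."""
    out = {}
    for i in dual:
        total = sum(d for j, d in bfs_hops(dual, i).items() if j != i)
        out[i] = 1.0 / total if total else 0.0
    return out


def betweenness(dual):
    """Unnormalized shortest-path betweenness (Brandes' accumulation)."""
    score = {v: 0.0 for v in dual}
    for s in dual:
        order, queue = [], deque([s])
        preds = {v: [] for v in dual}
        sigma = {v: 0 for v in dual}   # number of shortest paths from s
        dist = {v: -1 for v in dual}
        sigma[s], dist[s] = 1, 0
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in dual[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in dual}
        for w in reversed(order):      # farthest strokes first
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair (j, k) was counted once from j and once from k.
    return {v: b / 2.0 for v, b in score.items()}


def clustering(dual):
    """Eq. 4: fraction of neighbor pairs that are themselves connected."""
    out = {}
    for i, nbrs in dual.items():
        k = len(nbrs)
        if k < 2:
            out[i] = 0.0               # convention assumed for k < 2
            continue
        e = sum(1 for u, v in combinations(nbrs, 2) if v in dual[u])
        out[i] = 2.0 * e / (k * (k - 1))
    return out


# Toy dual graph: triangle A-B-C with a pendant stroke D attached to A.
toy = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
```

In the toy graph, stroke A has the highest degree, closeness and betweenness, while B and C have the highest clustering coefficient, which illustrates why the clustering coefficient can rank strokes differently from the centralities.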

Linear Correlation Model of Road Ranking

Each of the five properties reflects only some aspects of road ranking in terms of structural and geometric characteristics. A number of correlation models that combine structural and geometric characteristics have been built and can be used to assess road ranking comprehensively. This paper uses only a simple, basic linear correlation model of road ranking, expressed as:

$$Rank = \sum\limits_{i = 1}^{n} {\alpha_{i} X_{i} }$$
(5)

where $X_i$ is the $i$-th property of the stroke and $\alpha_i$ is the weight factor of that property. Five properties are used to rank the strokes: one is the basic geometric property and the others are structural properties. Thematic properties, which are unavailable, are not considered in the stroke-based approach. To maximize the amount of information carried by the model (Luan 2012), $\alpha_i$ is defined as follows:

$$\alpha_{i} = E_{i} /\sum\limits_{j = 1}^{m} {E_{j} }$$
(6)

The amount of information carried by property $i$ is measured by $E_i$, which is expressed as follows:

$$E_{i} = \sigma_{i} \sum\limits_{j = 1}^{m} {(1 - r_{ij} )}$$
(7)

Here $\sigma_i$ is the standard deviation of property $i$, $r_{ij}$ is the correlation coefficient between properties $i$ and $j$, and $m$ is the number of properties.
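Eqs. 5 to 7 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the property values are assumed to be already scaled to comparable ranges, the population standard deviation and the Pearson correlation coefficient are used, and the weights are normalized so that they sum to one. The function names and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def information_weights(columns):
    """Eqs. 6 and 7: weight each property by its spread and its conflict
    (one minus correlation) with the other properties."""
    names = list(columns)
    info = {}
    for i in names:
        conflict = sum(1.0 - pearson(columns[i], columns[j]) for j in names)
        info[i] = pstdev(columns[i]) * conflict
    total = sum(info.values())
    return {i: e / total for i, e in info.items()}


def rank_scores(columns, weights):
    """Eq. 5: weighted sum of the property values of each stroke."""
    n = len(next(iter(columns.values())))
    return [sum(weights[p] * columns[p][s] for p in columns) for s in range(n)]


# Hypothetical, already normalized property values for four strokes.
props = {
    "length": [0.2, 0.5, 1.0, 0.1],
    "degree": [0.25, 0.5, 1.0, 0.0],
    "betweenness": [0.0, 1.0, 0.3, 0.0],
}
weights = information_weights(props)
scores = rank_scores(props, weights)
```

A property that is nearly constant, or that is almost perfectly correlated with the others, receives a small weight under this scheme. A constant property would make the correlation undefined and needs separate handling.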

An alternative definition of $\alpha_i$ is given in Eq. 8, where $\mu_i$ is the mean of property $i$. This ratio is the coefficient of variation of the property; unlike Eq. 7, it does not take the correlation between properties into account.

$$\alpha_{i} = \sigma_{i} /\mu_{i}$$
(8)
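For comparison, Eq. 8 reduces to one line per property. The sketch below assumes the population standard deviation and a non-zero mean; `variation_weights` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def variation_weights(columns):
    """Eq. 8: standard deviation divided by mean, per property."""
    return {name: pstdev(vals) / mean(vals) for name, vals in columns.items()}


# A property that varies across strokes versus one that is constant.
cv = variation_weights({"length": [1.0, 2.0, 3.0], "degree": [2.0, 2.0, 2.0]})
```

A constant property gets a weight of zero, since it cannot help to distinguish one stroke from another.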

Measurement

Using the linear correlation model of road ranking, geometric and structural properties can be combined into a new stroke-based index of road ranking. However, including all properties is not necessarily the best solution. We should therefore select a subset of properties and combine them into a new index, and a method is needed to evaluate that index.

In complex networks, there are several ways of measuring the functionality of the networks. One key quantity is the average inverse geodesic length (Holme et al. 2004), which is a finite value even for a disconnected graph:

$$l^{ - 1} = \frac{1}{N(N - 1)}\sum\limits_{v \in V} {\sum\limits_{w \ne v \in V} {\frac{1}{d(v,w)}} }$$
(9)

The road network can be expressed as a graph $g = (V, E)$, where $V$ is the set of vertices, each standing for a road, and $N$ is the number of vertices. Each edge in $E$ connects exactly one pair of vertices and represents a connection between two roads. $d(v,w)$ is the length of the geodesic (shortest path) between $v$ and $w$; for a pair with no connecting path, the term $1/d(v,w)$ is taken as zero, which keeps $l^{-1}$ finite. When the high-rank roads are removed, the functionality of the network should decline quickly at first and then more slowly. Therefore, $l^{-1}$ can be used to evaluate the indexes by removing roads in descending order of rank.
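The average inverse geodesic length of Eq. 9 can be computed with one breadth-first search per vertex. The sketch below assumes an unweighted dual graph stored as an adjacency dictionary and counts geodesic length in hops; unreachable pairs simply contribute nothing to the sum. The function name is hypothetical.

```python
from collections import deque


def average_inverse_geodesic_length(dual):
    """Eq. 9 on an unweighted graph given as {vertex: set of neighbors}."""
    n = len(dual)
    if n < 2:
        return 0.0
    total = 0.0
    for v in dual:
        dist = {v: 0}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for w in dual[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        # Vertices missing from `dist` are unreachable and add zero.
        total += sum(1.0 / d for w, d in dist.items() if w != v)
    return total / (n * (n - 1))


# Path A-B-C plus an isolated stroke D: the measure stays finite.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
value = average_inverse_geodesic_length(toy)
```

For the toy graph the ordered reachable pairs contribute 1 + 1/2 + 1 + 1 + 1/2 + 1 = 5, so the result is 5/12.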

Study Area

Three real road networks of varying patterns are tested (Fig. 3). After building the strokes, the road network of Chengdu (Fig. 3a) has 253 strokes, that of Hong Kong (Fig. 3b) has 484, and that of New York (Fig. 3c) has 933.

Fig. 3 a Chengdu road network; b Hong Kong road network; c New York road network

Evaluations on Road Ranking Using Road Removing

Steps of Road Removing

The evaluation of road ranking by road removal proceeds as follows:

(1) Building the strokes: Set the angle threshold to 45 degrees and build the strokes for the road network. This is the basic operation for studying the road network.

(2) Building the dual graph: Apply the dual method to build the dual graph of the road network from the strokes.

(3) Calculating the values of properties: Obtain the degree, betweenness, closeness, clustering coefficient and length of each stroke (a node in the dual graph).

(4) Generating the integrated indexes of road ranking: Apply the properties to the linear correlation model to generate eight integrated indexes that may be used to rank the roads, and calculate each index. Here, the length and the degree of the stroke, as the basic elements of road ranking, should always be considered. Table 1 lists the eight integrated indexes, which adopt different properties.

Table 1 Eight different indexes using five properties of the road

(5) Removing the strokes in order: For each integrated index, sort the strokes by index value in descending order and calculate $l^{-1}$ as the strokes are removed in that order. Plot the change curve of $l^{-1}$.
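Step (5) can be sketched as a loop that removes strokes by descending score and records $l^{-1}$ after each removal. This is a minimal sketch under assumptions the text does not settle: the ranking is computed once and not updated after removals, and $l^{-1}$ is recomputed on the remaining graph, so $N$ shrinks with each removal (keeping $N$ fixed at its original value is an equally plausible reading). All names are hypothetical.

```python
from collections import deque


def inverse_geodesic(dual):
    """Average inverse geodesic length of {vertex: set of neighbors}."""
    n = len(dual)
    if n < 2:
        return 0.0
    total = 0.0
    for v in dual:
        dist = {v: 0}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for w in dual[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(1.0 / d for w, d in dist.items() if w != v)
    return total / (n * (n - 1))


def remove_stroke(dual, stroke):
    """Copy of the dual graph without `stroke` and its connections."""
    return {u: nbrs - {stroke} for u, nbrs in dual.items() if u != stroke}


def removal_curve(dual, scores):
    """Remove strokes by descending score; return the order and the curve."""
    order = sorted(dual, key=lambda s: scores[s], reverse=True)
    curve = [inverse_geodesic(dual)]
    current = dual
    for stroke in order:
        current = remove_stroke(current, stroke)
        curve.append(inverse_geodesic(current))
    return order, curve


# Star-shaped toy graph ranked by degree: the hub H is removed first.
star = {"H": {"A", "B", "C"}, "A": {"H"}, "B": {"H"}, "C": {"H"}}
star_scores = {"H": 3, "A": 1, "B": 1, "C": 1}
order, curve = removal_curve(star, star_scores)
```

In the star graph the measure drops from 0.75 to 0 as soon as the hub is removed, which is the steep initial decline expected of a ranking that places the functionally important strokes first.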

Experiment Result and Analysis

Eight indexes are applied to three real road networks of different patterns. Figure 4 shows the change curve of $l^{-1}$ as the roads are removed in descending order of rank; the curve reflects the functionality of the real road network. For clarity, the results for each road network are shown in two graphs, which together include all index tests.

Fig. 4 Eight indexes tested by $l^{-1}$ for three different road patterns: a and b for Chengdu; c and d for Hong Kong; e and f for New York

The results for the Chengdu road network (Fig. 4a, b) show that when the clustering coefficient is added to the composite index, $l^{-1}$ does not decline as the roads of Ranks 20–30 are removed, whereas DL and DLB perform very well. In addition, when closeness is considered, $l^{-1}$ exhibits a jump after low-rank roads are removed. Similar phenomena occur in the Hong Kong road network (Fig. 4c, d). Because New York has more roads, its curves may at first appear to behave differently from those of Chengdu; however, a closer look at Fig. 4e and f reveals the same phenomenon.

To further test the validity of the DLB indicator, a road selection test is carried out on the Chengdu road network at different selection proportions. Figure 5 gives five road selection results. They show that: (1) at each proportion, even a very small one, the selected network maintains the topological connectivity of the original network and covers its whole extent; (2) at each proportion, the selected network preserves the overall structure of the original road network; and (3) as the selected proportion increases, the added roads are reasonable with respect to the density and overall structure of the original road network, and the hierarchy of the road network is reflected. The same holds for the Hong Kong and New York road networks.

Fig. 5 Road selection results at various selection ratios

Conclusions

Evaluating road rank is not simply a matter of aggregating many road properties; we must also consider whether each property should be added to the composite index. The results show that length and degree are the basis for evaluating the importance of roads. When the clustering coefficient is included, the composite indexes sort high-rank roads poorly. When closeness is included, the sorting of low-rank roads becomes unreasonable. When length, degree and betweenness are combined, the composite index performs best in sorting roads.

Furthermore, to improve the performance of the road ranking method, it would be valuable to consider more road ranking approaches (not only the linear correlation model) and more characteristics of road networks.