Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

It has become commonplace to think of ourselves as inhabitants of a “networked world.” The most obvious contemporary manifestation is the Internet, augmented in recent years by web 2.0 technologies that enable online social networks and by mobile technologies which maintain those connections even while people move through global transport networks from city to city and continent to continent. If “[t]he most profound technologies are those that disappear” (Weiser 1991, p. 94), then the Internet is by any measure profound, so much so that we only notice it – it only becomes visible – when it is unavailable. Of course, most networks are much older and more obviously geographical than the Internet. Significant infrastructure from transport systems and telecommunications to the supply of electricity and water is in the form of networks. Arguably, when it comes to understanding the aggregate geographies of the human world, whether from a social, economic, or cultural perspective, it is networks which structure, constitute, and organize those patterns.

Manuel Castells (1996) foresaw (but only just!) this development in his The Rise of the Network Society. Castells suggests that the network society alters social, economic, and cultural relationships, creating a global “space of flows” not directly associated with any particular location on the Earth’s surface. Less radically, other scholars have argued that a key determinant of the relative importance of world cities is not their geographical location per se but their location in economic, transport, social, and cultural networks. For example, Taylor et al. (2011), using network measures, rank the relative importance of cities to argue that London is an “alpha++” city outranking many more populous cities such as Tokyo (alpha+), Seoul (alpha), or Los Angeles (alpha). What makes London rank above other cities is not its particular individual characteristics or geographical location, but its position in relation to other cities, in other words its position in multiple overlapping networks of relationships between cities worldwide.

However, we are getting ahead of ourselves. Whether or not we consider a network analysis of world cities (or anything else) to be informative, before we can deploy such methods, we must define terms and develop measures. As in any field of quantitative study we need measures to enable repeatable descriptions of the objects of study and models to allow us to determine if the measurements we make of empirical cases are interesting. In the next section, basic concepts, definitions, and measures from graph theory are introduced. There follows a consideration of higher-level concepts of graph structure and associated measures. Throughout these sections pertinent aspects of spatial networks are discussed. Following from this, we introduce some models for spatial networks and comment on their properties. We then consider some significant findings from the rapidly growing literature applying these methods to real spatial networks. The article ends with some pointers to possible future directions.

Note that we do not consider here the numerous problems in computer science, operations research, and transport analysis (particularly traffic assignment and related problems) which are closely associated with the analysis of spatial networks. Interested readers should consult reference works in these fields and related chapters in this major reference work.

2 Spatial Networks and Graphs

In their stimulating and still relevant text Network Analysis in Geography, Haggett and Chorley (1969) follow Kansky (1963) in moving quickly from considering spatial networks to the analysis of mathematical graphs. Real spatial networks are complicated physical entities, with numerous elements, themselves often complex entities, such as multilane highways or airports with several runways (see Fischer 2004 for how these complications may be handled in a GIS setting). Our primary interest in the analysis of spatial networks lies in understanding how the network as a whole structures connectivity so as to centralize some locations, marginalize others, and, in general, differently position locations with respect to one another. It makes sense to strip away the messy complication of real spatial networks and work with the simpler, abstract representation of a mathematical graph.

We therefore begin with definitions from graph theory, which provides a foundation for the analysis of networks. A graph is a mathematical abstraction which can represent any set of elements somehow related to one another. Wilson (1996) provides a succinct introduction to the key terms and concepts discussed below. More advanced references delve into this field of discrete mathematics (Gross and Yellen 2006), which is fundamental to computer science (see, e.g., Jungnickel 1999), and is increasingly considered fundamental across all the sciences (Newman 2010).

2.1 Basic Definitions

A graph \( G \) consists of a finite, nonempty set \( V=\left\{ {{v_i}} \right\} \) of vertices and a finite nonempty set \( E=\left\{ {{e_i}} \right\} \) of distinct, unordered pairs of distinct elements in \( V \), called edges. The number of elements in \( V \), commonly denoted \( n \), is the degree of \( G \). The number of edges in \( E \) is often denoted \( m \). Figure 64.1 shows a typical small graph with \( V=\left\{ {a,b,c,d,e,\,f\!,g,h} \right\} \), \( E=\left\{ {ab,\,\,bc,\,\,cd,\,\,cf,\,\,de,\,\,dg,\,\,dh,\,\,eg,\,\,fg,\,\,gh} \right\} \), \( n=8 \), and \( m=10 \).

Fig. 64.1
figure 1

A typical graph

The edge \( {v_i}{v_j} \), or \( {e_{ij }} \), is said to join (more commonly link or connect) the vertices \( {v_i} \) and \( {v_j} \), and these vertices are considered adjacent. We say that \( {e_{ij }} \) is incident with \( {v_i} \) and \( {v_j} \) and that \( {v_i} \) is a neighbor of \( {v_j} \). The neighborhood \( N({v_i}) \) of \( {v_i} \), often written simply \( {N_i} \), is the set of vertices adjacent to \( {v_i} \). Two edges incident with the same vertex are adjacent edges. In Fig. 64.1, \( {N_b}=\left\{ {a,c} \right\} \), and edges \( ab \) and \( bc \) are adjacent.

Given the ubiquity of graphs (or networks), it should come as no surprise that there is considerable confusion around terminology, with different fields adopting different terms in various contexts. Vertices are often referred to as nodes and represent the entities in a network, such as cities, people, cell phone towers, airports, and railroad stations. Edges are commonly referred to as links or connections and represent relationships between nodes, such as movements of goods, services or people, existence of airline routes, and mutual intervisibility. We can think of graphs as mathematical abstractions of networks which exist in the real world, in much the same way that variables represent measurements of real phenomena – this is the distinction between vertices and nodes, edges and links, and so on. In this section, while introducing formal definitions from graph theory, we adopt the proper mathematical terms, but elsewhere may return to widely used synonyms (such as network, node, and link).

The structure \( G=(V,\,\,E) \) described so far is a simple graph, which has limited relevance to the representation of complicated real-world networks. We may also want to include cases where vertices may be joined to themselves by a loop \( {v_i}{v_i} \), and multiple edges may also be allowed if we drop the requirement that edges be distinct. More significantly, directed graphs (sometimes referred to as digraphs) consist of a set of vertices \( V \) and a set of arcs \( A \) or directed edges, each of which consists of an ordered pair of vertices in \( V \), implying directionality in the relationship between the vertices. This departure from the simple graph allows us to consider relationships where flows in each direction may be different (or even nonexistent in one direction), and is obviously an important consideration when we consider many real-world infrastructure or distribution networks.

Another variant on the simple graph is the weighted graph where each edge has an associated value or weight often denoted \( {w_{ij }} \), representing some attribute of the relationship between the vertices it joins. The most obvious attribute of interest in many geographical applications is the length of the edge, measured either as a distance or perhaps duration. More generally, edge weights may represent some cost associated with movement along the edge. Less obviously, but equally applicable, are edge weights that somehow represent the strength of the relationship between the vertices they join. The volume or value of trade between two countries and the number of flights daily between two airports are just two examples among many possibilities. In many cases, weights relating to the strength of a relationship between the incident vertices will reflect rates of flow or the capacity of the associated edges.

2.2 Vertex Degree, Graph Density, and Local Clustering

Even the limited graph theoretic concepts introduced so far allow us to develop useful descriptive measures of graph structure. Most obviously, the number of edges incident with a vertex is its degree denoted \( deg({v_i}) \) or \( {k_i} \). The average vertex degree is a useful summary measure of graph structure, given by

$$ \overline{k}=2m/n $$
(64.1)

since each edge is incident with two vertices. This measure is equivalent to Kansky’s \( \beta \) index (1963) differing only by a constant multiplier. The degree list of a graph is the set of vertex degrees often arranged in order of increasing degree. For the graph in Fig. 64.1, the degree list is \( \left\{ {1,2,2,2,2,3,4,4} \right\} \). In large graphs representing complex real-world networks, it is more useful to examine the degree distribution of the vertices, an aspect considered in more detail in later sections, although, as we shall see the degree distributions of many spatial networks are strongly constrained by their spatial embedding. If all vertices have the same degree \( k \), it is regular of degree \( k \), or \( k \) -regular. In practice, this is unlikely to occur in spatial networks, but may provide a useful benchmark or null model for assessment of how regularly structured is an observed network.

For a simple graph with \( n \) vertices, the maximum number of edges that could exist is given by \( \left( {\matrix{ n \cr 2 \cr }} \right)=n(n-1)/2 \). Comparing the actual number of edges in the graph to this maximum provides a measure of how strongly connected the graph is overall, namely, its density, \( \rho =m/\left( {\matrix {n \cr 2 \cr }} \right)={2m \left/ {n(n-1) } \right.} \). A graph’s density is the fraction of all possible edges which could exist which actually do exist. The graph in Fig. 64.1 has density \( 2\times 10/(8\times 7)=0.357 \). Because the number of possible edges in a graph grows approximately with the square of its degree, whereas in most spatial networks the number of edges grows roughly linearly with graph degree, most spatial networks have low density, and only a small proportion of all the possible connections exist. This generally arises either because of distance decay effects or due to planarity constraints. We consider both issues in more detail below.

Because of the constraints on overall graph density in spatial networks, it is often more interesting to consider the local density or clustering of a spatial network. This is a measure of how strongly connected the graph is in the neighborhood of each vertex. The clustering coefficient of a particular vertex is given by

$$ C({v_i})=\frac{{2{m_i}}}{{{k_i}({k_i}-1)}} $$
(64.2)

where \( {m_i} \) is the number of edges joining vertices in the neighborhood of \( {v_i} \). This is a direct localized equivalent to graph density and provides information about how well connected the network is locally. The distribution of the clustering coefficient in a network provides useful information about its structure. One interpretation is that it gives the probability, given that two vertices \( {v_j} \) and \( {v_k} \) are neighbors of \( {v_i} \) that \( {v_j} \) and \( {v_k} \) are themselves neighbors. Many spatial networks exhibit high clustering coefficients compared to nonspatial networks, which is unsurprising: if two vertices in a spatial network are neighbors, it implies that they are near one another, and if two vertices share a common neighbor, since they are probably also near one another, there is a high chance that they will be neighbors of one another.

2.3 Spatial Embedding and Planarity

Thus far, there has been no explicit consideration of the spatial aspect. Where we are concerned with spatial networks, vertices will have an associated spatial entity, often conveniently considered to be a point location, but potentially also a more complex spatial entity such as a region – for example, in a trade network, vertices may represent regions or countries.

Two types of spatial embedding of a graph are possible. The most obvious spatial networks are those where both vertices and edges are spatially embedded. Examples include transport and infrastructure networks, where the graph edges are physically realized in space, with direct implications for any associated weights or directional restrictions. Less obviously, spatial embedding of edges imposes a constraint on the overall network structure, that of planarity. A planar graph is one which can be drawn in two dimensions with no edges intersecting except at vertices on which they are both incident. For many infrastructure networks, this is approximately true, although bridges and tunnels in ground-transport networks are an obvious (but generally minor) exception. The planarity constraint significantly alters the overall structure of graphs, and we consider its implications in the following paragraphs.

A second form of spatial embedding is where vertices have associated spatial locations, but edges represent nonspatial relationships. An example is a spatially embedded social network. Individuals in the network have some spatial location – perhaps their home address – but edges might represent friendship or acquaintanceship relationships with no corresponding physical realization. A less obvious example is when the vertices in the graph represent spatially extended entities – such as metro lines – and edges represent a relationship such as “has an interchange with.” Such networks rarely constitute the primary object of analysis, although they may easily arise as dual graphs in some analyses. In considering such a network to be a spatial network, we implicitly assume that the distance between vertices (whether direct Euclidean distance or over intervening spatial networks – see below) has an effect on the probability of their existence. In other words, we expect that vertices more remote from one another are less likely to be joined than those that are closer together.

Where the distinction matters, we will refer below to fully embedded or vertex-embedded spatial networks, reserving the term spatial networks to refer to networks of either kind.

The fundamental difference between spatial networks with spatially embedded edges, which are (approximately) planar, and spatial networks not affected by this constraint lies in the limits it places on the overall density of the graph both globally and locally. A fundamental result is Euler’s formula for planar graphs

$$ n-m+f=2 $$
(64.3)

where \( n \) and \( m \) are the number of vertices and edges as before, and \( f \) is the number of faces or regions in the plane which the graph divides the space into. We consider the overall region in which the graph is embedded as face, so that for the graph in Fig. 64.1, \( f=4 \), that is, the whole space and the regions \( cdgf \), \( deg \), and \( dhge \). Euler’s result is easily proved when we consider starting from a graph consisting of one vertex and no edges, so that \( n=1 \), \( m=0 \), and \( f=1 \) when Eq. (64.3) clearly holds (see Fig. 64.2). Adding any edge while maintaining planarity, either (i) joins two existing vertices without intersecting an existing edge, so increasing both \( m \) and \( f \) by one, while leaving \( n \) unchanged, or (ii) adds a new vertex and joins it with an edge, increasing both \( n \) and \( m \) by one with no change in \( f \). In either case, Eq. (64.3) remains true.

Fig. 64.2
figure 2

How a planar graph grows as edges are added. Either the number of faces \( f \) (upper path) or the number of vertices \( n \) (lower path) must increase, but not both

Euler’s formula has important implications for the possible density of planar graphs. Since every face requires at least three edges, and each face can share an edge with at most one other face, we know that \( m\ge 3f/2 \). Combining this result with Eq. (64.3) we arrive at

$$ m\le 3n-6 $$
(64.4)

Combining this result with Eq. (64.1) tells us that the upper bound on the mean degree of a planar graph is \( \overline{k}\le 6 \). Kansky (1963, p. 18) recognizes this in providing alternative formulations of his \( \gamma \) index (equivalent to graph density) for planar and nonplanar graphs.

Understanding this result, it is much easier to understand why area maps are so distinctively structured and why the Voronoi tessellation and associated Delaunay triangulation exhibit such characteristic structure. Since the spatial network constructed from the adjacency relations of a set of polygonal regions must necessarily be planar, the mean number of neighbors of each region cannot exceed \( 6 \). In graph terms, this bound on the number of edges in a planar graph and thus on many spatial networks, means that almost all spatially embedded networks are sparse, with \( m\ll {n^2} \) and \( \rho \propto 1/n \) as \( n\to \infty \). In terms of local clustering, planarity also implies that any vertex with \( {k_i}>4 \) must have \( {C_i}\ < \ 1 \) since it is impossible for any graph of more than four vertices to be fully connected.

2.4 Shortest Paths, Distances, and Network Efficiencies

A particular sequence of edges \( \left\{ {{v_0}{v_1},\,\,{v_1}{v_2},\ldots,\,\,{v_{r-1 }}{v_r}} \right\} \) forms a walk of length \( r \). If the vertices in a walk are distinct, then it is a path, and a path that begins and ends at the same vertex forms a cycle. The distance between vertices \( {v_i} \) and \( {v_j} \), \( {d_{ij }} \) is the length of the shortest path among the set of all possible paths between \( {v_i} \) and \( {v_j} \). Any one path between \( {v_i} \) and \( {v_j} \) of length \( {d_{ij }} \) is a geodesic. The largest distance between any two vertices is the diameter of the graph.

In a directed graph, walks may only proceed in the direction of constituent arcs. In a weighted digraph, the length of a path is generalized from the above definitions by summing the weight of its constituent edges, and the distance between two vertices is the length of the shortest path, as before. Given that the weights associated with the arcs in each direction between any two vertices are not necessarily the same (that is, \( {w_{ij }}\ne {w_{ji }} \)), there is no guarantee that the distances between vertices in a directed graph will be symmetric. Note also that where graph weights do not represent a “traversal cost,” such as when they represent link capacities or trade volumes, numbers of people, or other similar measures, then it does not make sense to accumulate edge weights in this way.

The (graph) distances between any two nodes or between all pairs of nodes in a spatial network are of considerable interest, particularly in how they compare to the corresponding straight line (Euclidean) distances between the corresponding node locations. If a network provides a path between two nodes whose distance is close to the straight line distance, then the network is efficient for that particular journey. On the other hand, if the network requires a much longer and more circuitous path to be taken between two locations, it is inefficient. A measure of the network efficiency for a single particular path is the route factor defined by Black (see 2003), following Nordbeck (1964) as

$$ {Q_{ij }}=\frac{{{d_G}({v_i},{v_j})}}{{{d_E}({v_i},{v_j})}} $$
(64.5)

where \( {d_G} \) and \( {d_E} \) are graph-based and Euclidean distances, respectively, between locations \( i \) and \( j \). We can average this quantity over a particular node

$$ \overline{{{Q_i}}}=\frac{1}{n}\sum\limits_j\,{Q_{ij }} $$
(64.6)

or over the whole network

$$ \overline{Q}=\frac{1}{n(n-1)}\sum\limits_{{j\ne i}}\,\overline{{{Q_{ij}}}} $$
(64.7)

The route factor provides one perspective on the efficiency of a network as-built. Another perspective is to consider how cheaply a set of locations might be connected. A tree is a graph which includes no cycles, in which \( n=m+1 \). The minimum spanning tree of a set of vertices is the tree which minimizes the total weight of the edges in the tree, and when the weights relate to the cost of providing the associated node-to-node links, it represents the cheapest way to connect every node to every other. However, such a network is unlikely to be efficient from the point of view of a user of the network as measured using the route factor, since it will certainly involve many very circuitous shortest paths (see Fig. 64.3). Such a network is also vulnerable to failure, since losing just one edge will leave it disconnected. In practice, real networks will have more edges than the minimum spanning tree.

Fig. 64.3
figure 3

The minimum spanning tree of a set of points

3 Higher-Order Structure in Networks

The measures we have considered so far generally focus on the overall structure of a network in a general way or on the structure at a particular location. Summary measures or distributional properties of these measures are generally useful, but they often fail to reveal structural aspects of networks which arise out of the totality of all the spatial relationships in the network. In this section, we briefly consider measures of such higher-order structure.

3.1 Network Centrality

Consideration of distance in networks leads naturally to questions of the most accessible or central locations in the network. An obvious approach is to calculate the mean distance from a node to every other node in the network:

$$ \mathop{\overline{d}}\nolimits_i=\frac{1}{n}\sum\limits_j\,{d_{ij }} $$
(64.8)

where \( {d_{ij }} \) is the graph distance between vertices \( {v_i} \) and \( {v_j} \) as previously defined. Using this centrality measure, the most central node in a network is that with the minimum \( {d_i} \). While this is an obvious measure of network centrality, there are many alternatives. One which has received considerable attention in recent years, because of its close relationship to movement on the network and to how subregions of the graph are connected to one another (see below) is betweenness centrality. The betweenness centrality of a vertex \( {v_i} \) is the proportion of the shortest paths between all other pairs of vertices \( {v_j}\ne {v_k} \) in which \( {v_i} \) appears. If \( {g_{jk }}({v_i}) \) is the number of shortest paths from \( {v_j} \) to \( {v_k} \) in which \( {v_i} \) appears, and \( {g_{jk }} \) the total number of shortest paths from \( {v_j} \) to \( {v_k} \), then

$$ {c_{\mathrm{between}}}=\sum\limits_{{j\ne k}}\,\frac{{{g_{jk }}({v_i})}}{{{g_{jk }}}} $$
(64.9)

This measure has the nice property that it can be readily extended to edges also, simply being the proportion of all shortest paths on which each edge lies. Betweenness centrality provides an indication of the extent to which each vertex or edge has the potential to control movement or communication in the graph, assuming that there is “everywhere-to-everywhere” movement in the system. This measure is directly related to approaches that rely on random walk models. The most central vertices and edges measured in this way are those which will experience the most traffic when a population of random walkers move around the system.

3.2 Network Modules or Subgraphs

A class of measures which remains difficult to define precisely, but which has a clear intuitive interpretation, has recently come to the fore, as researchers attempt to determine how a network can be broken into cohesive subgraphs or regions, now most often referred to as communities. Fortunato (2010) provides a comprehensive overview of developments in this field. The general definition of a community is that the member vertices of a community are more strongly connected to one another within the community, than they are to vertices outside the community. This definition is not very precise, however. To take it to the extreme, we could argue that any joined pair of vertices are more closely connected to one another than they are on average to the rest of the graph (unless the graph is fully connected). A less extreme definition is to consider as communities, small, fully connected subgraphs or cliques (from their origins in social network analysis, see Wassermann and Faust 1994), but due to spatial constraints, cliques are unlikely in fully embedded spatial networks and so unlikely to be useful.

We obviously need a more flexible definition. Many have been suggested in the social networks literature (see Wassermann and Faust 1994, pp. 257–267), but most suffer from serious computational challenges in identifying them in graphs of any size, because of the exponential growth in the number of subsets of the vertex set \( V \) as graphs get larger. An important breakthrough has been in the development of more computationally tractable methods, beginning with the Girvan-Newman algorithm (Girvan and Newman 2002). Many of these methods are based on heuristic approaches, which successively remove edges of low betweenness centrality while repeatedly recalculating some measure of the quality of the resulting graph decomposition. Other methods aim to identify hierarchically nested communities and can consequently deal with very large networks with millions of nodes. Fortunato (2010) provides comprehensive details and references. It is notable that none of these methods are explicitly spatial, although where edge existence and/or weight is dependent on spatial proximity, this should not be a cause for concern.

The net result of these considerations is that the current working definition of a graph community is a circular one: graph communities are those subgraphs in a graph identified by a community-detection algorithm. This places considerable importance on the analyst’s ability to meaningfully interpret any communities so identified, a situation analogous with that in the cluster analysis of multivariate statistical data.

Vertex centrality and community structure for a typical spatial network are illustrated in Fig. 64.4. In Fig. 64.4a vertex centrality calculated from the total path length from each vertex to every other while considering the Euclidean length of the graph edges is shown, with the darkest shaded vertices the most central. As is often the case, the most central vertices are those that are most geographically central to the network, as we might expect. By contrast, in Fig. 64.4b the betweenness centrality based only on the network topology is shown. Because, in topological terms, the vertices to the west of the central part of this network provide a shortcut around the densely packed central region, these vertices are highlighted by this centrality measure. Finally, Fig. 64.4c shows a possible community structure in this network where seven distinct regions in the network have been identified based on their mutual connectivity.

Fig. 64.4
figure 4

Centrality and network communities illustrated. See text for details

3.3 Structural Equivalence

A final graph structural characteristic likely to be of interest is the concept of structural equivalence among vertices or edges. This concept is easily grasped, but precise definitions are mathematically challenging and detection of structurally equivalent sets of vertices remains difficult. The idea is that vertices that are structurally equivalent have similar relationships to the rest of the graph as one another, an idea whose origins lie in social network analysis where structural equivalence is related to social roles (Lorrain and White 1971). Graph communities are a special case of structurally equivalent vertices, which share the property of being in the same community. In a transport network, we might expect major junctions on arterial routes to constitute an equivalence class. However, pinning down how this concept can be realized in practice has proved difficult, and detection of structurally equivalent (or more usefully structurally similar) sets of vertices is computationally challenging – consider that while community detection can assume that the subgraphs of interest are connected subsets of the graph, no such assumption can be made for structural equivalence classes. While the concept of structural equivalence is an attractive one for the analysis of spatial networks, progress in this area remains rather limited.

4 Generating Networks: Spatial Network Models

While measures of network structure are important tools in improving our understanding of spatial networks, it is equally important to develop models for network formation. This was recognized by Haggett and Chorley in their coverage of [network] “Growth and Transformation” (1969, pp. 261–318), an extensive chapter and a very modern treatment. A recent review paper (Barthélemy 2011) provides a useful overview of many different spatial network models. Here we briefly review some of the available models and consider their general properties.

An important null model for any network is the Erdös–Rényi (E–R) model (see Erdös and Rényi 1960), which has been much studied. The E–R graph is generated as follows: create a set of \( n \) vertices, then consider every possible pair of vertices, and with probability \( p \) join them with an edge. Many of the expected properties of E–R graphs are well known. Of particular interest are the expected mean clustering coefficient and mean path length of vertices in the graph, once the network is sufficiently dense to be connected with no isolated clusters, an event which happens quite suddenly close to \( p=\ln n/n \). The expected clustering coefficient \( \left\langle C \right\rangle \) is given by \( p \) since \( p \) is the probability that any two vertices will be connected, and so is also the likely proportion of the neighbors of any vertex that will be connected. The expected shortest path length \( \left\langle d \right\rangle \) in the E–R graph is approximated by \( \ln n/\ln \overline{k} \).

4.1 Spatial Networks From Point Patterns

Perhaps the most obvious way to generate a spatial network is to begin with a point pattern and then to apply some geometric rules by which points are connected to one another (or not). This geometric graph model admits considerable variety in the outcomes depending on both the underlying point pattern and on the geometric rules applied. It also has the property that if we make the “rule” for joining nodes independent of the distance between them, then it is equivalent to the E–R random graph. More reasonable rules will be familiar from the construction of spatial weights matrices (see, e.g., pages 200–205 in O’Sullivan and Unwin 2010).

  • A distance criterion where two nodes are joined if they are closer together than some threshold distance. In Fig. 64.5a nodes nearer than \( 5 \) units apart are connected.

  • A nearest neighbor criterion where each node is joined to its nearest \( k \) neighbors. In Fig. 64.5b each node is joined to its \( 4 \) nearest neighbors.

  • An attribute-distance rule where depending on some attribute of the nodes and their separation distance they are joined or not. The simplest form of this rule is where the attribute is a radius of influence \( {r_i} \), and nodes \( i \) and \( j \) are joined if \( {r_i}+{r_j}\ < \ R \) where \( R \) is a threshold distance An example is shown in Fig. 64.5c. Such a simple rule might be meaningful in the context of trees in a forest influencing one another, but in regional science, a more likely formulation will be based on an interaction measure such as \( {m_i}{m_j}d_{ij}^{{-\alpha }} \) where the \( m \) values represent activity or population at each location and \( \alpha \) is a constant controlling the rate at which likely connection falls away with distance.

  • A pure geometric rule such as those governing the Delaunay triangulation or closely related Gabriel graphs (see Okabe et al. 2000) shown in Fig. 64.5d, e, respectively.

  • A global rule such as that governing construction of the minimum spanning tree, where the edges are those which together connect the network with the minimum total path length. An example is shown in Fig. 64.5f.

Fig. 64.5
figure 5

Examples of geometric networks as described in the text. The region is 40 units east–west and 60 north–south, and the point pattern is inhomogeneous Poisson with greater intensity at the center of the region. Point symbols in (c) are scaled so that if the circles of two points intersect, they are joined in the network

As is clear from Fig. 64.5, the various geometric rules produce quite different networks. An important distinction is that the Delaunay, Gabriel, and minimum spanning trees are planar, whereas the other networks are not. At the same time that they do no guarantee planarity, the distance and distance-attribute models may also leave some nodes unconnected. It is unusual for real-world spatial networks to leave isolated regions, and so it may be that hybrid models where a distance criterion is applied subject to a connectivity and/or planarity requirement are more reasonable in some cases. One approach is to start with a network substrate, such as the minimum spanning tree or Gabriel graph, and add additional edges according to a distance criterion.

4.2 Spatial Small Worlds

Small-world networks are so-called after the commonly encountered apparent contradiction in social networks that while they are locally highly connected (i.e., they have high clustering coefficients), they also have globally short paths (i.e., the mean path length is low). It is apparent that this result does not hold for graphs generated by the E–R model. As noted, E–R graphs become connected when \( p=\ln n/n \) when \( \overline{k}\simeq \ln n \). For a network with \( 1,000 \) nodes, this gives us \( \left\langle C \right\rangle =p=0.00691 \), \( \overline{k}\simeq 6.91 \), and \( \left\langle d \right\rangle \simeq 3.57 \). Increasing \( n \) to \( {10^6} \) reduces \( \left\langle C \right\rangle \) to \( 1.38\times {10^{-5 }} \) while \( \left\langle d \right\rangle \) only increases to \( 5.26 \). Clearly, although E–R networks are small worlds with short path lengths, they do not have the high local connectivity that makes this property in social networks surprising.

Watts and Strogatz (1998) presented an alternative network model that is both highly clustered locally, yet has short mean path lengths. Their approach is to start with a regular lattice and “rewire” it by breaking links and reconnecting them to other nodes selected at random from anywhere in the network. They show that only small numbers of rewiring events are necessary to dramatically reduce the mean path length in a lattice. Although Watts and Strogatz present their work for one-dimensional lattices, the basic idea is readily extended to more realistic spatial settings.

In two dimensions, a regular lattice is a grid of nodes with each node connected to its four nearest neighbors. The expected path length between any two nodes selected at random scales with \( {n^{1/2 }} \), and, in general in a \( D \)-dimensional lattice path lengths will scale with \( {n^{1/D }} \). Spatial small-world models, rather than rewire the lattice, typically introduce additional “shortcut” links with the probability of the shortcuts dependent on the distance between the vertices they join. The probability that a shortcut \( {e_{ij }} \) exists might be proportional to \( d_{ij}^{{-\delta }} \) where \( \delta \) is a parameter chosen in a particular case. As \( \delta \) is increased while holding constant the overall number of additional links added, the networks produced by models of this kind transition from random to small-world to regular lattice properties. This is readily understood in qualitative terms. For low values of \( \delta \), any length shortcut is equally likely – in effect, nodes are undifferentiated from one another – and the network has distance properties similar to a random network. High values of \( \delta \) heavily penalize the provision of longer shortcuts, leaving the lattice’s overall \( {n^{1/D }} \) distance-scaling property intact. Many transportation networks lie somewhere on this continuum, depending on how we incorporate shortcuts such as urban orbital highways, high-speed rail links, and airline routes into the more densely connected local transport network.

4.3 Growing Spatial Networks: Preferential Attachment

The examples above apply a connection or rewiring rule to a preexisting set of nodes. Arguably, a more realistic approach is to grow a spatial network from an initial individual node, by progressive addition of new nodes and edges, according to some rules governing how new nodes are attached to existing ones. Once again, the baseline case is a nonspatial network growth model known as the preferential attachment model, attributable to Albert et al. (1999), which has spawned a large literature on “scale-free” networks (see Caldarelli 2007). The basic idea is that nodes are added to a network and attach themselves preferentially to those nodes that already have larger numbers of connections. The resulting networks have heavy-tailed distributions of vertex degrees meaning that a small number of very strongly connected nodes dominate the network structure.

Planar networks clearly cannot exhibit such characteristics, and physical constraints in most spatial networks prevent an unrestricted preference for attachment to the most well-connected existing nodes. Preferential attachment models that consider space, require each new node to have a spatial location, and the probability of attachment to existing nodes is then a function of both the degree of existing nodes and the distance between the new node and the existing nodes; this is similar to the attribute-distance geometric models considered previously, but with progressive addition of nodes rather than an all-at-once calculation. Models of this general structure often produce the hub-and-spoke structures characteristic of many distribution networks. Again, as with spatial small-world networks, the rate at which the probability of a connection decays with distance is important in determining overall characteristics of the resulting networks.

4.4 Dual Graphs: New Graphs from Old

An important idea for models of networks is the dual transformation, whereby an initial graph is transformed to a new graph by switching between nodes and edges or (in a planar graph) between faces and nodes. The line graph \( L(G) \) of \( G \) is the graph whose vertices correspond to the edges of \( G \) and where two vertices are joined when their corresponding edges are adjacent. This dual transformation is shown in Fig. 64.6 and as in the case illustrated results in a denser graph with more variety in the vertex degree distribution than the original “primal” graph. The line graph dual transformation is often applied to the more obvious primal network representation of a system, such as the road intersection and segment network, because the richer structure provides more opportunities for insight into key features of the network. Figure 64.6bd shows a simple example. In a planar graph, a similar dual transformation entails treating each face of the graph as vertex in a new graph, and joining those vertices whose faces are adjacent in the original graph. This is the relationship between the familiar Voronoi tessellation and the Delaunay triangulation (see Okabe et al. 2000).

Fig. 64.6
figure 6

Different graphs from the same network: (a) the line graph dual transformation, white squares and gray lines are the original graph and black circles and dashed lines are the line graph; (b) a primal graph for a road network; (c) line graph from the same road network; and (d) named road graph from the same road network

4.5 Matrix and Adjacency List Representation of Graphs

Before closing the discussion of network analysis measures and models, it is important to note that even simple analysis of graphs requires careful consideration of how they are stored for computational purposes. There are two distinct approaches, which is preferable being largely a function of the graph density.

An obvious approach, given its close relationship to spatial weights matrices, is the graph adjacency matrix \( \mathbf{A}(\it G)={a_{ij }} \) where \( {a_{ij }}=1 \) if the edge \( {e_{ij }} \) exists and \( 0 \) otherwise. The row order of \( \mathbf{A} \) is unimportant, but the row and column ordering must be the same. The incidence matrix \( \mathbf{B} \) is an alternative matrix representation that records the incidence of edges and vertices and is an \( n\times m \) matrix where \( {b_{ij }}=1 \) if \( {e_j} \) is incident with \( {v_i} \), and \( 0 \) otherwise. A useful relationship between the incidence matrix and adjacency matrix is that

$$ \mathbf{BB}^{\it T}={{\mathbf{A}}^{*}} $$
(64.10)

where \( {{\mathbf{A}}^{*}} \) is the adjacency matrix, modified such that the elements in the main diagonal are equal to the degree of the corresponding vertex. Another useful transformation is that the adjacency matrix \( \mathbf{A}(\it L) \) of the line graph of \( G \) is given by \( {{\mathbf{B}}^T}\mathbf{B}-{\text2}{{\mathbf{I}}_{\it m}} \) where \( \mathbf{B} \) is the incidence matrix of \( G \) as defined above and \( {{\mathbf{I}}_m} \) is the \( m\times m \) identity matrix.

However, many spatial networks have very low densities. This makes adjacency matrices an inefficient representation because many \( 0 \) entries are stored even though they record no useful information. Therefore, an adjacency list representation is often more appropriate and simply consists of a list of all the edges in the graph. Depending on the implementation details, it may also be necessary for vertices to be explicitly listed, or for the number of vertices in the graph to be stored, before the edges are listed. Appropriate modifications of such data structures can readily accommodate directed or weighted graphs.

Many tools and software platforms used for the analysis of graph data make use of both matrix and adjacency list representations, and, given the sparseness of many spatial networks, it is important that an efficient sparse matrix implementation be available. For many analyses, the ability to quickly convert back and forth between (dense or sparse) matrix and adjacency list representations is necessary for efficient analysis.

5 Properties of Real-World Spatial Networks

Armed with the measures and models introduced above, it is possible to investigate the properties of real-world spatial networks. This remains an active area of research in many fields, and we restrict the discussion in this section to pointing to interesting examples and useful review materials, which enable a rapid introduction to specific fields.

5.1 Road Networks

Road networks are the most immediately obvious network encountered in everyday life. The primal representation of a road network, where vertices are the road intersections and edges are the road segments between them, generally exhibits rather uninteresting structure. Across large areas of a given city, the road network approximates to a two-dimensional lattice, and the range of vertex degrees is limited by geometry: it is unusual for road junctions to connect more than five or six road segments. However, when we range across larger scales, the road hierarchy in most regions introduces shortcuts in the form of highways with limited connections to the lattice of local roads. Spatial small-world networks capture this structure relatively well.

More interesting features of road networks may emerge when the primal representation is converted to the line graph dual or when named roads are treated as the units of analysis (i.e., the vertices in the graph) and intersections between named roads are the graph edges. Either of these transformations admits greater variety in the vertex degree distribution, and can enable the topologically most central roads to be identified, which may be of greater interest than more traditional approaches in some cases (see Jiang 2006). Perhaps the most interesting work in this area has been recent efforts to model a variety of urban street networks using rather simple models based on the preferential attachment principle but taking spatial constraints into account (Courtat et al. 2011).

5.2 Transport Networks

Transport networks cover a wide range of modes other than road-based. Relative to road networks, the most obvious feature of other modes is their point-to-point or station-connection structure. These features introduce greater potential for departures from planarity, particularly when airline networks and shipping routes are considered. In analysis of an extensive database of world airline routes, Barrat et al. (2004) demonstrate that this network has many interesting properties, including small-world characteristics, and distinctive scales related to regional and global service areas. This paper and others focusing on airline networks and the various findings in this area are well covered in the spatial networks review paper by Barthélemy (2011, pp. 13–17).

A comprehensive overview of developments in the analysis of all kinds of transport networks is provided by Rodrigue et al. (2009) where coverage extends beyond the more structural forms of analysis discussed in this chapter to cover how transport networks structure the regional and global economy and how they impact urban mobility and related issues of transport policy and planning. The grounding of earlier work in real-world histories provides a striking contrast with recent work in a journal special issue “Evolution of Transportation Network Infrastructure” (see Levinson 2009) where more exploratory analyses of different network growth models are highlighted.

5.3 Other Spatially Embedded Networks

It is appropriate given its importance in inspiring much of the recent explosion in work on networks to point to work on the Internet. This is representative of a wide range of work on infrastructure networks of all kinds. It is easy to forget that the Internet relies like other networks on physical plant of various kinds and that considerations such as efficiency of service provision, costs of installation, and vulnerability to disruption are critical concerns for the Internet backbone as they are for other infrastructure such as electricity and water supply. A comprehensive overview of network analysis work on the Internet is provided by Pastor-Satorras and Vespignani (2004). More geographically grounded perspectives that focus attention on the spatial embedding of Internet infrastructure focus on how the structure of the Internet relates to local geographical factors and to other infrastructure networks, often showing that places that are well connected by airlines, roads, and other systems tend also to be well provided with Internet connectivity (Malecki 2002). Once again, the interplay between exploratory analysis of overall structure and more grounded approaches is critical to progress in understanding in this field.

Finally, we briefly consider spatially embedded social networks, perhaps the fundamental building block of all the other networks considered. An excellent overview of how space and social networks may be mutually reinforcing and how these effects can be modeled is provided by Butts and Acton (2011). They strongly argue for the benefits of analysis that attends to both network aspects and spatial aspects. Among the most promising areas for future development in this field are coevolutionary networks (Gross and Blasius 2008) where network structures and the attributes of nodes and edges mutually influence one another over time, and the wide-ranging study of how processes such as disease spread or the diffusion of ideas occur on networks (see Newman 2010, pp. 627–676).

6 Conclusions

Many, perhaps most, features of the human world can be considered to be embedded in space and networked to one another at various spatial scales either (more or less) permanently or in constantly changing ways. This chapter has deliberately focused on basic concepts and models that are useful for the analysis of such networks, particularly emphasizing the rapid growth of ideas in the recently emerged “science of networks.” While much of this material has been developed in statistical physics and allied fields, it is apparent that the insights yielded by these approaches build on much earlier work on networks in geography and regional science, extending it and applying fundamental ideas to larger and more dynamic networks than before. Even so, claims that work in these areas heralds a new dawn for the social sciences (see, e.g., Watts 2007) seem overdone. On the contrary, it is probable that the best and most insightful work will continue to demand the application of measures, methods, and models from network science reviewed here, in combination with detailed, well-grounded empirical research on the development and structure of networks in specific contexts in space and time.