Abstract
There is a wide variety of computational experiments, or statistical simulations, in which regional scientists require regular and irregular lattices with a predefined number of polygons. While most commercial and free GIS software offer the possibility of generating regular lattices of any size, the generation of instances of irregular lattices is not a straightforward task. The most common strategy in this case is to find a real map that matches as closely as possible the required number of polygons. This practice is usually conducted without considering whether the topological characteristics of the selected map are close to those for an “average” map sampled from different parts of the world. In this paper, we propose an algorithm, RI-Maps, that combines fractal theory, stochastic calculus and computational geometry for simulating realistic irregular lattices with a predefined number of polygons. The irregular lattices generated with RI-Maps have guaranteed consistency in their topological characteristics, which reduces the potential distortions in the computational or statistical results due to an inappropriate selection of the lattices.
Introduction
The complexity of computational experimentation in regional science has drastically increased in recent decades. Regional scientists are constantly developing more efficient methods, taking advantage of modern computational resources and geocomputational tools, to solve larger problem instances, generate faster solutions or approach asymptotics. The first formulation of the p-median problem provides a numerical example that required 1.51 min to optimally locate four facilities in a 10-node network [52]; three decades later, Church [16] located five facilities in a 500-node network in 1.68 min. As noted by Anselin et al. [7], spatial econometrics has also benefited from computational advances; the computation of the determinant required for maximum likelihood estimation of the spatial autoregressive model proposed by Ord [47] was feasible only for data sets of no more than 1000 observations. Later, Pace and LeSage [48] introduced a Chebyshev matrix determinant approximation that allows the computation of this determinant for over a million observations in less than a second. According to Blommestein and Koper [11], one of the first algorithms for constructing higher-order spatial lag operators, which was devised by Ross and Harary [54], required 8000 s (approximate computation time) to calculate the sixth-order contiguity matrix in a 100×100 regular lattice. Anselin and Smirnov [5] propose new algorithms that are capable of computing a sixth-order contiguity matrix for the 3111 contiguous U.S. counties in less than a second.
An important aspect when conducting computational experiments in regional science is the selection of the way that the spatial phenomena are represented or conceptualized. This aspect is of special relevance when using a discrete representation of continuous space, such as polygons [34]. This representation can be accomplished through regular and irregular lattices; the use of one or the other could cause important differences in the computational times, solution qualities or statistical properties. We suggest four examples, as follows: (1) The method proposed by Duque et al. [21] for running the AMOEBA algorithm [1] requires an average time of 109 s to delimit four spatial clusters on a regular lattice with 1849 polygons. This time rises to 229 s on an irregular lattice with the same number of polygons. (2) For the location set covering problem, Murray and O’Kelly [46] concluded that the spatial configuration, number of needed facilities, computational requirements and coverage error all varied significantly as the spatial representation was modified. (3) Elhorst [24] warns that the parameters of the random effects spatial error and spatial lag model might not be an appropriate specification when the observations are taken from irregular lattices (see Note 1). (4) Anselin and Moreno [4] find that the use of regular or irregular lattices affects the performance of test statistics against alternatives of the spatial error components form.
However, returning to the tendency toward the design of computational experiments with large instances, there is an important difference between generating large instances of regular and irregular lattices. On the one hand, regular lattices are easy to generate, and there is no restriction on the maximum number of polygons. On the other hand, instances of irregular lattices are usually made by sampling real maps. Table 1 shows some examples of this practice.
The generation of large instances of irregular lattices has several complications that are of special interest in this paper. First, the size of an instance is limited to the number of polygons of the available real lattices. Second, the possibility of generating a large number of different instances of a given size is also limited (e.g., generate 1000 instances of irregular lattices with 3000 polygons). Third, as shown in Fig. 1, the topological characteristics of irregular lattices built from real maps change drastically, depending on the region from where they are sampled, which could bias the results of the computational experiments (see Note 2).
This paper seeks to contribute to the field of computational experiment design in regional science by proposing a scalable recursive algorithm (RI-Maps), which combines concepts from stochastic calculus (mean reverting processes), fractal theory and computational geometry to generate instances of irregular lattices with a large number of polygons. The resulting instances have topological characteristics that are a good representation of the irregular lattices sampled from around the world. Last, the use of these instances guarantees that the differences in the results of computational experiments are not a consequence of differences in the topological characteristics of the lattices used.
The remainder of this paper is organized as follows: Section “Conceptualizing Polygons and Lattices” introduces the basic definitions of the polygons and lattices and proposes a consensus taxonomy of the lattices. Section “Topological Characteristics of Regular and Irregular Lattices” presents a set of indicators that are used to characterize the topological characteristics of a lattice and shows the topological differences between regular and irregular lattices. Section “RI-Maps: An Algorithm for Generating Realistic Irregular Lattices” presents the algorithm for generating irregular lattices. Section “Results” evaluates the capacity of the algorithm to generate realistic irregular lattices. Section “Application of RI-Maps” presents an example of the use of RI-Maps in a computational experiment. Finally, Section “Conclusions” presents the conclusions.
Conceptualizing Polygons and Lattices
A polygon is a plane figure enclosed by a set of finite straight line segments. Polygons can be categorized according to their boundaries, convexity and symmetry properties, as follows:
(i) Boundary: A polygon is simple when it is formed by a single plane figure with no holes, and it is complex when it contains holes or multiple parts (see Note 3).
(ii) Convexity: In a convex polygon, every pair of points can be connected by a straight line without crossing its boundary. A concave polygon is simple and non-convex.
(iii) Symmetry: A regular polygon has all of its angles of equal magnitude and all of its sides of equal length. A non-regular polygon is also called irregular [19, 38].
A lattice is a set of polygons of any type, with no gaps and no overlaps, that covers a subspace or the entire space. More formally, a lattice is the division of a subspace \(S \subseteq \mathbb{R}^{n}\) into k subsets \(s_{i} \subseteq S\) such that \(\bigcup_{i} s_{i} = S\) and \(\bigcap_{i} s_{i} = \phi\), where \(\phi\) is the empty set of \(\mathbb{R}^{n}\) [32] (see Note 4). There exist different taxonomies of lattices depending on the field of study. In an attempt to unify these taxonomies, a consensus lattice taxonomy is presented in Fig. 2. This taxonomy classifies lattices according to the shapes of their polygons, their spatial relationship and the use, or not, of symmetric relationships to construct the lattice (see Note 5):
(i) According to the variety of the shapes of the polygons that form the lattice: Homomorphisms are lattices that are formed by polygons that have the same shape, and polymorphisms are lattices that are formed by polygons that have different shapes.
(ii) According to the regularity of the polygons that form the lattice and the way in which they intersect at each vertex (see Note 6): Regular, lattices formed by regular polygons in which all of the vertexes join the same arrangement of polygons [57]; semi-regular, when the polygons are regular but there are different configurations of vertexes; and irregular otherwise [28].
(iii) According to the existence of symmetric relationships within the lattice (see Note 7): Symmetric, when the lattice implies the presence of at least one symmetric relationship; and asymmetric otherwise.
(iv) According to the symmetric relationship of translation: A lattice is periodic if and only if it implies the use of translation without rotation or reflection; it is aperiodic otherwise [57].
Table 2 shows an example of each category of this consensus taxonomy.
The topological characteristics of lattices are usually summarized through the properties of the sparse matrix that represents the neighboring relationships between the polygons in the map, the so-called W matrix [8, 12, 30, 41, 50] (see Note 8). This paper uses six indicators, of which the first three are self-explanatory: the maximum (\(M_{n}\)), minimum (\(m_{n}\)) and average (\(\boldsymbol{\mu }_{1}\)) number of neighbors per polygon. The fourth indicator, the sparseness (S), see Eq. (1), is defined as the percentage of entries equal to one with respect to the total number of entries in a binary W matrix (\(k^{2}\), where k is the number of polygons in the lattice). The fifth indicator is the first eigenvalue of the W matrix (\(\boldsymbol{\lambda }_{1}\)). It is an algebraic construct commonly used in graph theory [26, 58] and regional science [12–14, 30] to summarize different aspects of the W matrix. The first eigenvalue, \(\lambda_{1}\), is the maximum real value, \(\lambda\), that solves the system given by Eq. (2), where \(I_{k}\) is the identity matrix of order k × k. The last indicator, \(\boldsymbol{\mu }_{2}\), is the variance of the number of neighbors per polygon. It measures the spatial disorder of a lattice, and is given by Eq. (3), where \(W_{ij}\) denotes the value of W in row i and column j.
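As an illustration of how these six indicators can be obtained in practice, the following minimal sketch computes them from a dictionary of neighbor lists. It is not the authors' code: the function name `lattice_indicators`, the use of the population variance for \(\mu_{2}\), and the shifted power iteration used to approximate \(\lambda_{1}\) are assumptions made here, and the boundary correction described later in the text is not applied. The lattice is assumed to be connected and the neighbor relation symmetric.

```python
import math


def lattice_indicators(neighbors, iters=500):
    """Summarize a binary W matrix given as {polygon_id: [neighbor_ids]}.

    Hypothetical helper: assumes a connected lattice with symmetric contiguity.
    """
    ids = list(neighbors)
    index = {p: i for i, p in enumerate(ids)}
    k = len(ids)
    degrees = [len(neighbors[p]) for p in ids]

    mu1 = sum(degrees) / k                              # average number of neighbors
    mu2 = sum((d - mu1) ** 2 for d in degrees) / k      # population variance (assumption)
    sparseness = 100.0 * sum(degrees) / k ** 2          # percentage of ones in W

    # Power iteration on W + I; the shift avoids oscillation on bipartite
    # lattices such as square grids, where -lambda_1 is also an eigenvalue.
    v = [1.0] * k
    for _ in range(iters):
        w = [v[i] + sum(v[index[q]] for q in neighbors[p]) for i, p in enumerate(ids)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' W v recovers the first eigenvalue of W itself.
    wv = [sum(v[index[q]] for q in neighbors[p]) for p in ids]
    lambda1 = sum(a * b for a, b in zip(v, wv))

    return {
        "max_neighbors": max(degrees),
        "min_neighbors": min(degrees),
        "mu1": mu1,
        "mu2": mu2,
        "sparseness": sparseness,
        "lambda1": lambda1,
    }
```

For large lattices a sparse eigenvalue solver would be a more natural choice than this pure-Python loop; the sketch only aims to make the definitions concrete.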
Within the field of regional science, lattices are frequently used with two purposes: First, real lattices can be used to study real phenomena, e.g., to analyze spatial patterns, confirm spatial relationships between variables and detect spatio-temporal regimes within a spatial panel, among others. Second, lattices can be used to evaluate the behavior of statistical tests [4, 45], algorithms [21] and topological characteristics of lattices [8, 40, 41]. In these cases, it is necessary to use sets of lattices that satisfy some requirements imposed by the regional scientist, e.g., the number of polygons, regularity or irregularity of the polygons and the number of instances. To accomplish this goal, a common approach is to use, as the geographical base for real or simulated data, either polymorphic irregular aperiodic asymmetric lattices (e.g., real lattices and Voronoi diagrams) or homomorphic regular periodic symmetric lattices (e.g., regular lattices). The following sections are restricted to the second use of lattices.
Topological Characteristics of Regular and Irregular Lattices
As stated above, regional scientists have the option of using regular or irregular lattices in their computational experiments. However, this section will show that there are important topological differences between these types of lattices.
Real lattices have topological characteristics that vary substantially from location to location. As an example, Fig. 3 presents the topological characteristics of lattices of different sizes (100, 400 and 900 polygons) sampled in Spain and the United States. Each box-plot summarizes 1000 instances. Important differences emerge between these two places: Spanish polygons tend to have more neighbors, are more disordered and their first eigenvalues are higher in mean and variance. These differences in the topological characteristics have direct repercussions on the performance of algorithms whose complexity depends on the neighboring structure [1, 21].
Regular lattices and Voronoi diagrams are also commonly used for computational experiments because they are easy to generate, there is no restriction on the size of the instances (the number of polygons in the map) and their over-simplified structure allows for some mathematical simplifications or reductions [9, 31, 61]. However, the topological characteristics of these lattices are substantially different from real, irregular lattices. These differences can lead to biased results in theoretical and empirical experiments, e.g., spatial stationarity in STARMA models [36], improper conclusions about the properties of the power and sample sizes in hypothesis testing [4, 45] and the over-qualification of the computational efficiency of the algorithms [1, 21], among others. Table 3 shows the topological differences between real maps, two types of regular lattices and Voronoi diagrams.
To illustrate the magnitude of these differences, we calculated the topological indicators (\(M_{n}\), \(m_{n}\), \(\mu_{1}\), \(\mu_{2}\), S and \(\lambda_{1}\)) for six thousand lattices of different sizes (1000 instances each of 100, 400, 900, 1600, 2500 and 3600 polygons) that were sampled around the world at the smallest administrative division available in Hijmans et al. [35]. As an example, Fig. 4 shows seven of those instances. These real instances are then compared to regular lattices that have square and hexagonal polygons and to Voronoi diagrams (see Note 9). To avoid the boundary effect on \(M_{n}\), \(m_{n}\), \(\mu_{1}\) and \(\mu_{2}\), the bordering polygons are only considered to be neighbors of interior polygons. Last, S and \(\lambda_{1}\) are calculated using all of the polygons. Table 3 shows that regular lattices are not capable of emulating the topological characteristics of real lattices in any of the indicators: \(\mu_{2} = 0\) and \(M_{n} = m_{n} = \mu_{1} = 4\) and 6 (for squares and hexagons, respectively) are values that are far from those of real lattices. The values obtained for \(\lambda_{1}\) and S indicate that regular lattices of hexagons are more connected than real lattices, while regular lattices of squares are less connected than real lattices. With regard to Voronoi diagrams, \(M_{n}\) and \(m_{n}\) indicate that they are not capable of generating atypically connected polygons. The values of \(\mu_{1}\) are close to real lattices. Finally, Voronoi diagrams are more ordered than real lattices, with values of \(\mu_{2}\) close to 1.7, while real lattices report values of \(\mu_{2}\) that are close to 8.
RI-Maps: An Algorithm for Generating Realistic Irregular Lattices
This section is divided into two parts. The first part introduces an algorithm that generates irregular polygons based on a mean reverting process in polar coordinates, and the second part proposes a novel method to create polymorphic irregular aperiodic lattices with topological characteristics that are similar to those of real lattices.
Mean Reverting Polygons (MR-Polygons)
The problem of characterizing the shape of irregular polygons is commonly addressed in two ways, that is, evaluating its similitude with a circle [33] or describing its boundary roughness through its fractal dimension [10, 25] (see Note 10). In this paper, we apply both concepts in different stages during the creation of a polygon: The similitude with a circle to guide a mean reverting process in polar coordinates, and the fractal dimension to parameterize the mean reverting process.
Mean Reverting Process in Polar Coordinates
Different indexes are used to compare irregular polygons with a circle: Elongation ratio [60], form ratio [37], circularity ratio [44], compactness ratio [18, 29, 53], ellipticity index [56] and the radial shape index [17]. As Chen [15] states, all of these indexes are based on comparisons between the irregular polygon and its area-equivalent circle. Under this relationship, an irregular polygon can be conceptualized as an irregular boundary with random variations following a circle, which leads us to use a mean reverting process in polar coordinates to create irregular polygons (see Note 11). A mean reverting process is a stochastic process that takes values that follow a long-term tendency in the presence of short-term variations. Formally, the process x at the moment t is the solution of the stochastic differential equation (4), where \(\mu\) is the long-term tendency, \(\alpha\) is the mean reversion speed, \(\sigma\) is the gain in the diffusion term, \(x(t_{0})\) is the value of the process when t = 0 and \(\{B_{t}\}_{t \geq 0}\) is a one-dimensional Brownian motion [43]. Equation (5) shows the general solution; however, for practical purposes, hereafter we use the Euler discretization method, which is given by Eq. (6), where \(\varepsilon_{t}\) is white noise.
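To make the discretization concrete, the sketch below simulates a mean reverting path with an Euler scheme. It assumes the standard Ornstein-Uhlenbeck form \(dx = \alpha(\mu - x)\,dt + \sigma\,dB_{t}\), whose Euler step is \(x_{t+\Delta t} = x_{t} + \alpha(\mu - x_{t})\Delta t + \sigma\sqrt{\Delta t}\,\varepsilon_{t}\); because Eqs. (4) to (6) are not reproduced here, this form, the function name `mean_reverting_path` and its arguments are assumptions rather than the paper's exact formulation.

```python
import math
import random


def mean_reverting_path(alpha, mu, sigma, x0, dt, steps, seed=None):
    """Euler discretization of an (assumed) Ornstein-Uhlenbeck process."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(steps):
        eps = rng.gauss(0.0, 1.0)                   # white noise term
        drift = alpha * (mu - path[-1]) * dt        # pull toward the long-term level mu
        shock = sigma * math.sqrt(dt) * eps         # short-term variation
        path.append(path[-1] + drift + shock)
    return path


# Example with values in the ranges discussed in the text.
radius_path = mean_reverting_path(alpha=0.3, mu=10.0, sigma=1.3, x0=10.0,
                                  dt=0.001, steps=1000, seed=42)
```

With \(\sigma = 0\) the path decays deterministically toward \(\mu\), which is a quick way to check the sign of the drift term.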
Algorithm 1 MR-Polygon: mean reverting polygon.
Algorithm 1 presents the procedure for generating an irregular polygon P in polar coordinates using, as a data generator, a mean reverting process (\(X_{t}\)). This algorithm guarantees that the distance between two points in \(X_{t}\), following the process \(X_{t}\), is equal to the distance between the same two points in P when following the process P counterclockwise. The purpose of this equivalence is to preserve the fractal dimension of \(X_{t}\) in P. The angles \(\Delta_{R}\) and \(\phi_{1}\) in Algorithm 1 are the result of solving the geometric problem presented in Fig. 5. These two angles are used in Eq. (7) to establish the location of the next point in P. The points of P are denoted as \(P_{\theta}\), with \(\theta\) between 0 and \(2\pi\).
Because the process P depends on the parameters \(\alpha\), \(\mu\) and \(\sigma\), it is worthwhile to clarify their effect on the shape of polygon P: \(\alpha\) is the speed at which the process reverts to the circle with radius \(\mu\), and \(\sigma\) is the scaling factor of the irregularity of the polygon. High values of \(\alpha\) and low values of \(\sigma\) generate polygons that have shapes that are close to a circle with radius \(\mu\). Finally, \(\Delta t\) is utilized to preserve the fractal dimension of both processes, X and P, and determines the angular step, \(\phi_{1}\) (see Fig. 5).
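The length-preserving idea can be sketched as follows. This is one possible reading of the construction, not a transcription of Algorithm 1 or Eq. (7): it assumes that each straight step of the process in the (t, x) plane, of length \(\sqrt{\Delta t^{2} + \Delta R^{2}}\), must equal the chord between two consecutive polygon points with radii \(r\) and \(r' = r + \Delta R\). The law of cosines then gives \(\sin(\phi/2) = \Delta t / (2\sqrt{r\,r'})\), so the angular step is governed by \(\Delta t\). The function names, the positivity guard on the radius and the stopping rule are hypothetical.

```python
import math
import random


def mr_polygon_polar(alpha, sigma, mu=10.0, x0=10.0, dt=0.001, seed=None):
    """Return (radii, angles) of a polygon whose radius is mean reverting."""
    rng = random.Random(seed)
    r, theta = x0, 0.0
    radii, angles = [r], [theta]
    while True:
        # Euler step of the (assumed) Ornstein-Uhlenbeck radius process.
        r_next = r + alpha * (mu - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        r_next = max(r_next, dt)  # hypothetical guard: keep the radius positive
        # Chord length equals the step length sqrt(dt^2 + (r_next - r)^2):
        # law of cosines => sin(phi / 2) = dt / (2 * sqrt(r * r_next)).
        phi = 2.0 * math.asin(dt / (2.0 * math.sqrt(r * r_next)))
        if theta + phi >= 2.0 * math.pi:
            break  # a full turn is complete; the last point is joined to the first
        theta += phi
        r = r_next
        radii.append(r)
        angles.append(theta)
    return radii, angles


def to_cartesian(radii, angles):
    """Convert polar vertices to (x, y) pairs."""
    return [(r * math.cos(a), r * math.sin(a)) for r, a in zip(radii, angles)]
```

Under this reading, \(\Delta t = 0.001\) and \(\mu = 10\) produce on the order of tens of thousands of vertices per polygon, which is consistent with the later decision to work with sampled polygons.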
MR-Polygon Parameterization
The process of establishing the values for \(\alpha\), \(\mu\), \(\sigma\), \(\Delta t\) and \(X_{0}\) is not an easy task, and their values must be set in such a way that the shape of P is similar to a real irregular polygon. However, how do we determine whether a polygon P satisfies this condition? In this case, the fractal dimension appears to be a tool that offers strong theoretical support to assess the shape of a given polygon.
According to Richardson [53], the fractal dimension D of an irregular polygon (such as a coast) is a number between 1 and 2 (1 for smooth boundaries and 2 for rough boundaries) that measures the way in which the length of an irregular boundary L (Eq. (8)) changes when the length of the measurement instrument (ε) changes. The fractal dimension is given by Eq. (9), where \(\hat{C}\) is a constant.
In general, an object is considered to be a fractal if it is endowed with irregular characteristics that are present at different scales of study [42]. For practical purposes, D is obtained using Eq. (9) and is given by 1 minus the slope of log(L(ε)) plotted against log(ε). This procedure is commonly known as the Richardson plot.
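The Richardson plot procedure can be sketched in a few lines. The example assumes the usual relation \(L(\varepsilon) = \hat{C}\,\varepsilon^{1-D}\), so that D equals 1 minus the slope of the log-log regression; the helper names are hypothetical, and the divider walk is a simplified vertex-hopping approximation that fits a single line rather than separate textural and structural segments.

```python
import math


def divider_length(points, eps):
    """Approximate boundary length measured with a divider of opening eps.

    Walks along the vertices and jumps to the first vertex that is at least
    eps away from the current position (vertex-hopping approximation).
    """
    total = 0.0
    cur = points[0]
    for p in points[1:]:
        d = math.hypot(p[0] - cur[0], p[1] - cur[1])
        if d >= eps:
            total += d
            cur = p
    last = points[-1]
    total += math.hypot(last[0] - cur[0], last[1] - cur[1])  # leftover piece
    return total


def richardson_dimension(points, eps_values):
    """Estimate D as 1 minus the least-squares slope of log L(eps) on log eps."""
    xs = [math.log(e) for e in eps_values]
    ys = [math.log(divider_length(points, e)) for e in eps_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1.0 - slope


# A finely sampled circle of radius 10 is smooth, so D should be close to 1.
circle = [(10.0 * math.cos(2.0 * math.pi * i / 5000),
           10.0 * math.sin(2.0 * math.pi * i / 5000)) for i in range(5001)]
d_circle = richardson_dimension(circle, [0.1, 0.2, 0.4, 0.8, 1.6])
```

Rougher boundaries lose more measured length as the divider opening grows, which steepens the negative slope and raises the estimate of D.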
In almost all cases, the Richardson plot can be explained with two line segments that have different slopes; then, two fractal dimensions can be obtained: textural, for small scales, and structural, for large scales [39]. As an illustration, Fig. 6 shows a segment of the United States east coast taken from Google maps in two resolutions. Note that as the resolution increases, some irregularities that were imperceptible at low resolution become visible. In this sense, it can be said that irregularities at low resolution define the general shape and are related to the structural dimension, while irregularities at high resolution capture the noise and are related to the textural dimension. Regional scientists tend to use highly sampled maps, which preserve the general shape but remove the small variations. This simplification does not change the topological configuration of the maps [20]. Figure 7 presents the Richardson plot of the external boundary of the United States and its textural and structural fractal dimensions.
In the field of stochastic processes, some approaches, which are based on different estimations of the length, have been made to characterize them through their fractal dimension. In our case, an experimental approach based on the fractal dimension of real polygons is proposed to select an appropriate combination of the parameters \(\alpha\) and \(\sigma\) to generate realistic irregular polygons. Because our interest is on general shape rather than small variations, we account only for the structural dimension (see Note 12). The parameterization process is divided into two parts: In the first part, the frequency histogram of the fractal dimensions of the real polygons is constructed. In the second part, we propose a range of possible values for \(\alpha\) and \(\sigma\), given \(\mu\), \(X_{0}\) and \(\Delta t\), which generates fractal dimensions that are close to those obtained in the first part. Because the level of the long-term tendency \(\mu\) does not affect the length of X and because Algorithm 1 guarantees that the length is preserved, \(\mu\) can be defined as a constant without affecting the fractal dimension. Hereafter, it is assumed that \(\mu = X_{0} = 10\). The value of \(\Delta t\) is set to be 0.001 to properly infer both of the fractal dimensions.
The empirical distribution of the fractal dimension of the irregular polygons is calculated over a random sample of 10,000 polygons from the world map used in Section “Topological Characteristics of Regular and Irregular Lattices”. The result of this empirical distribution is presented in Fig. 8a. To find the fractal dimension of the MR-Polygons, we generate a surface of the average dimensions as a function of the values of \(\alpha\) and \(\sigma\), which range from 0.01 to 5 with steps of 0.1 (Fig. 8b). The resulting surface indicates that the fractal dimension is mainly affected by \(\sigma\), especially when looking at small dimensions. Additionally, it is found that fractal dimensions close to 1.23 are obtained when \(\sigma\) takes on values between 1.2 and 1.5, regardless of the value of \(\alpha\).
Figure 9 presents some examples of polygons using different values of \(\alpha\) and \(\sigma\). The polygons in the second row, which correspond to \(\sigma = 1.5\), are irregular polygons that have a realistic structural fractal dimension. Additionally, in the same figure, both the original (gray line) and sampled (black line) polygons reinforce the fact that sampling a polygon does not affect the structural dimension. From now on, we will use sampled polygons to improve the computational efficiency.
Recursive Irregular Maps (RI-Maps)
Up to this point, we were able to generate irregular polygons with fractal dimensions that are similar to those from real maps. The next step is to use these polygons to create irregular lattices of any size whose topological characteristics are close to the average values obtained for these characteristics in real lattices around the world. For this step, we formulate a recursive algorithm in which an irregular frontier is divided into a predefined number of polygons using MR-Polygons. Our conceptualization of the algorithm was made under three principles: (1) Scalability: Preserving the computational complexity of the algorithm when the number of polygons increases; (2) Fractality: Preserving the fractal characteristics of the map at any scale; and (3) Correlativity: Encouraging the presence of spatial agglomerations of polygons with similar sizes, which is commonly present in real maps in which there are clusters of small polygons that correspond to urban areas.
Algorithm 2 presents the RI-Maps algorithm to create polymorphic irregular aperiodic asymmetric lattices with realistic topological characteristics. This algorithm starts with an initial empty irregular polygon, pol, (the outer border of the RI-Map) and the number of polygons, n, to fit inside. In a recursive manner, a portion of the initial polygon pol starts being divided following a depth-first strategy until that portion is divided into small polygons (see Note 13). This process is repeated for a new uncovered portion of pol until the whole area of pol is covered. Because the recursive partitions are made by using MR-Polygons, we take the values of \(\alpha\) from a uniform distribution between 0.1 and 0.5, and the values of \(\sigma\) from a uniform distribution between 1.2 and 1.5. Regarding \(\mu\), \(X_{0}\) and \(\Delta t\), we use the values proposed in Section “Mean Reverting Polygons (MR-Polygons)”. Finally, to guarantee the computational tractability of the geometrical operations, each polygon comes from a sampling process of 30 points. The main steps of the RI-Maps algorithm are summarized in Fig. 10.
Algorithm 2 RI-Map: recursive irregular map.
The RI-Maps algorithm has three unknown parameters:
- \(p_{1}\): Because each polygon is created by the MR-Polygons using a polar coordinate system that is unrelated to the map being constructed with RI-Maps, it is necessary to apply a scaling factor, \(\sqrt{\frac{p_{1 } \times area(pol)} {n\times \pi \times \mu ^{2}}}\), that adjusts the size of the MR-Polygon before being included into the RI-Map.
- \(p_{2}\): When a new polygon is used to divide its predecessor, its capacity to contain new polygons (measured by the number of polygons) is proportional to its share of the unused area of its predecessor. However, to enforce the appearance of spatial agglomerations of small polygons, the number of polygons that the new polygon can hold is increased with a probability of \(p_{2}\).
- \(p_{3}\): When \(p_{2}\) indicates that a new polygon will hold more polygons, the number of extra polygons is calculated as the \(p_{3}\) percent of the number of missing polygons that are expected to fit into the unused area of its predecessor polygon. The number of extra polygons is subtracted from the unused area to keep constant the final number of polygons (n).
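The role of the three parameters can be sketched as follows. Only the scaling factor is taken directly from the text; the capacity rule is a hypothetical interpretation in which `missing` is the number of polygons still expected in the unused area of the predecessor, `area_share` is the new polygon's share of that area, and \(p_{3}\) is treated as a fraction between 0 and 1. Rounding and the bounds on the result are illustrative choices.

```python
import math
import random


def scaling_factor(p1, area_pol, n, mu=10.0):
    """Scaling applied to an MR-Polygon before it is placed in the RI-Map."""
    return math.sqrt(p1 * area_pol / (n * math.pi * mu ** 2))


def polygon_capacity(missing, area_share, p2, p3, rng):
    """Number of polygons a new polygon is asked to hold (hypothetical rule)."""
    capacity = missing * area_share          # proportional to its share of unused area
    if rng.random() < p2:                    # with probability p2, force an agglomeration
        capacity += p3 * missing             # extra polygons taken from the remainder
    return min(missing, max(1, round(capacity)))


rng = random.Random(0)
factor = scaling_factor(p1=0.010, area_pol=1000.0, n=400)
capacity = polygon_capacity(missing=100, area_share=0.25, p2=0.050, p3=0.315, rng=rng)
```

Because the extra polygons are drawn from the same pool of missing polygons, the total n is unchanged; only the spatial distribution of polygon sizes is affected.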
Table 4 illustrates the effect of the parameters \(p_{2}\) and \(p_{3}\) on the topological characteristics of RI-Maps. In the first row, \(p_{2}\) and \(p_{3}\) equal 0, which generates highly ordered lattices without spatial agglomerations. The second and third rows are more disordered than the first row and have spatial agglomerations, with those in the second row less frequent and evident than those in the third row. As will be shown in the next section, lattices in the third row are more realistic in terms of their topological characteristics.
To find a combination of \(p_{1}\), \(p_{2}\) and \(p_{3}\) that generates realistic RI-Maps in terms of their topological characteristics, we use a standard genetic algorithm, where the population at iteration i, denoted as \(\gamma^{i}\), is formed by the genomes \(\gamma _{j}^{i} = [p_{j_{1}}^{i},p_{j_{2}}^{i},p_{j_{3}}^{i}]\), where \(p_{j_{1}}^{i}\), \(p_{j_{2}}^{i}\) and \(p_{j_{3}}^{i}\) are real numbers between 0 and 1, representing instances of \(p_{1}\), \(p_{2}\) and \(p_{3}\), which are denoted as phenomes. In this case, \(i \in \mathbb{N}\) between 0 and 20 and \(j \in \mathbb{N}\) between 0 and 100. To evaluate the quality of each genome, the fitness function \(F(\gamma_{j}^{i})\) is defined in Eq. (10), where \(\theta\) is the set of lattice sizes (numbers of polygons) considered, \(\phi_{k}\) is the relative importance for a map of k polygons and \(f_{k}(\gamma_{j}^{i})\) is a function given by Eq. (11) that measures the average difference between the values of the topological indicators of real lattices and those values of RI-Maps formed by k polygons using the phenome \(\gamma_{j}^{i}\). For the sake of simplicity, in Eq. (11), \(\Psi_{k} = [M_{n}, m_{n}, \mu_{1}, \mu_{2}, S, \lambda_{1}]\) denotes the vector of real indicators and \(\Psi_{k}(\gamma_{j}^{i})\) denotes the vector for the mean values of RI-Maps with k polygons using \(\gamma_{j}^{i}\). The superindex l is used in \(\Psi_{k}^{l}\) and \(\Psi_{k}^{l}(\gamma_{j}^{i})\) to refer to the lth indicator in the real and simulated values, respectively. Finally, ns is the number of simulations to be generated with each genome.
The algorithm starts with an initial random population of 100 genomes to obtain the best four genomes. The subsequent populations are composed of two parts. The first 64 genomes are all of the possible combinations of the last best 4 genomes, and the other 36 genomes are random modifications of those 64 genomes. Because of the computational time required to evaluate Eq. (10), only lattices of 400 and 1600 polygons were used, with an importance of \(\phi_{400} = 1\) and \(\phi_{1600} = 2\), respectively. The algorithm reached the optimal value after 13 iterations with \(p_{1} = 0.010\), \(p_{2} = 0.050\) and \(p_{3} = 0.315\).
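The population update described above can be sketched as follows. The reading that the 64 genomes are the \(4^{3}\) gene-wise recombinations of the four best genomes is an interpretation; the Gaussian mutation, its scale, the function names and the weighted-sum form assumed for Eq. (10) are illustrative assumptions, since neither Eq. (10) nor the mutation operator is reproduced here.

```python
import itertools
import random


def next_population(best4, n_mutants=36, scale=0.05, seed=None):
    """Build 64 recombinations of the four best genomes plus random variants."""
    rng = random.Random(seed)
    p1s = [g[0] for g in best4]
    p2s = [g[1] for g in best4]
    p3s = [g[2] for g in best4]
    # Every gene may come from any of the four best genomes: 4 * 4 * 4 = 64.
    combos = [list(c) for c in itertools.product(p1s, p2s, p3s)]
    mutants = []
    for _ in range(n_mutants):
        base = rng.choice(combos)
        # Hypothetical mutation: Gaussian perturbation clipped to [0, 1].
        mutants.append([min(1.0, max(0.0, p + rng.gauss(0.0, scale))) for p in base])
    return combos + mutants


def weighted_fitness(f_by_size, weights):
    """Assumed form of Eq. (10): sum over lattice sizes k of phi_k * f_k."""
    return sum(weights[k] * f_by_size[k] for k in weights)


# Placeholder genomes and discrepancy values, for illustration only.
best = [[0.01, 0.05, 0.30], [0.02, 0.10, 0.35], [0.05, 0.20, 0.25], [0.10, 0.15, 0.40]]
population = next_population(best, seed=7)
score = weighted_fitness({400: 0.2, 1600: 0.1}, {400: 1, 1600: 2})
```

The weights 1 and 2 mirror the importance values reported in the text for lattices of 400 and 1600 polygons.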
Results
Figure 11 presents a graphical comparison of the topological characteristics of real lattices, RI-Maps and Voronoi diagrams. The values for the RI-Maps were obtained from 100 instances (see Note 14). The results show that RI-Maps have a maximum (\(M_{n}\)) and a minimum (\(m_{n}\)) number of neighbors that are very close to the values found in the real lattices. Regarding the average number of neighbors, both RI-Maps and Voronoi diagrams show similar values that are slightly higher than those observed in real lattices. However, because the number of neighbors is an integer value, it can be concluded for all three cases that the average number of neighbors is 6, which verifies the findings by Weaire and Rivier [59] in irregular lattices. Regarding \(\mu_{2}\), RI-Maps are a better approach to simulate the level of disorder found in real lattices. To facilitate the visualization, the values of S are reported as \(S {\ast}\sqrt{n}\). The results show that RI-Maps replicate the values of real lattices at any size, while Voronoi diagrams report higher values that tend to increase with the number of polygons. Last, RI-Maps have values of \(\lambda_{1}\) that are closer to the values of real lattices, especially for large instances.
Table 5 presents the average and standard deviation of RI-Maps under the optimal parameters (\(p_{1} = 0.010\), \(p_{2} = 0.050\), \(p_{3} = 0.315\)) found in the previous section. This table completes the topological information on lattices presented in Table 3. Figure 12 shows the running times for different instance sizes using an HP ProLiant DL140 Generation 3 computer running the Linux Rocks 6.0 operating system equipped with 8 GB RAM and a 2.33 GHz Intel Xeon Processor 5140. The dotted line shows the x = y values, but its non-linear appearance is due to the quadratic scale used in the x-axis to improve the visualization of the plot. Although the reported times correspond to a non-optimized code, the plot shows an almost linear relationship between the problem size and the running time (see Note 15).
Application of RI-Maps
In this section, we present an example of the use of RI-Maps based on the computational experiments designed by Duque et al. [21] to compare the efficiency of the improved AMOEBA algorithm. To present the results, Duque et al. [21] proposed three computational experiments; one of them reports the running time of AMOEBA as the number of polygons of regular lattices increases. In this paper, we will run the same algorithm not only for regular lattices but also for real irregular and simulated irregular lattices (RI-Maps). First, we want to see whether the conclusions that are obtained for regular lattices can be extrapolated to irregular lattices. Second, we want to see if the results obtained with RI-Maps are also valid for real irregular maps. This experiment was executed with an HP ProLiant DL140 Generation 3 computer running the Linux Rocks 6.0 operating system equipped with 8 GB RAM and a 2.33 GHz Intel Xeon Processor 5140.
In the generated experiment, for each type of lattice, there were 30 instances with 1600 polygons. For each instance, we generated a spatial process that had four clusters, using the methodology proposed by Duque et al. [21]. Last, the instances for real maps were obtained from sampling the same world map that was used in previous sections. Figure 13 presents the distribution of the running times obtained for each type of lattice, and Table 6 compares the distributions with the two-sided Kolmogorov-Smirnov test [27]. The null hypothesis of the Kolmogorov-Smirnov test is that the two samples come from the same probability distribution. Regarding the first question, it is clear that using a regular lattice for testing AMOEBA underestimates the execution times. On the other hand, the distributions of the running times obtained for real maps and RI-Maps are statistically indistinguishable, which shows the benefits of using RI-Maps because it can automatically generate instances without limiting the maximum number of polygons.
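For readers who want to reproduce this kind of comparison, the two-sample Kolmogorov-Smirnov statistic is simple to compute by hand. The sketch below returns only the statistic and the usual large-sample critical value \(c(\alpha)\sqrt{(n+m)/(nm)}\) with \(c(\alpha) = \sqrt{-\ln(\alpha/2)/2}\); it does not compute exact p-values (SciPy's `scipy.stats.ks_2samp` does), and the running times shown are made-up placeholders, not the values behind Table 6.

```python
import bisect
import math


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical distribution functions."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)   # ECDF of sample a at x
        fb = bisect.bisect_right(b, x) / len(b)   # ECDF of sample b at x
        d = max(d, abs(fa - fb))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Asymptotic two-sided critical value for sample sizes n and m."""
    c = math.sqrt(-math.log(alpha / 2.0) / 2.0)
    return c * math.sqrt((n + m) / (n * m))


# Placeholder running times in seconds (illustrative only).
times_regular = [101.0, 104.5, 98.2, 110.3, 107.9]
times_ri_maps = [215.4, 230.1, 222.8, 241.0, 208.6]
reject_equal = ks_statistic(times_regular, times_ri_maps) > ks_critical_value(5, 5)
```

With 30 instances per lattice type, as in the experiment, the asymptotic critical value at the 5% level is about 0.35.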
Conclusions
This paper introduces an algorithm that combines fractal theory, the theory of stochastic processes and computational geometry for simulating realistic irregular lattices with a predefined number of polygons. The main goal of this contribution is to provide a tool that can be used for geocomputational experiments in the fields of exploratory spatial data analysis, spatial statistics and spatial econometrics. This tool will allow theoretical and empirical researchers to create irregular lattices of any size and with topological characteristics that are close to the average characteristics found in irregular lattices around the world.
As shown in the previous section, the performance of some geocomputational algorithms can be affected by the topological characteristics of the lattices on which these algorithms are tested. This situation can lead to unfair comparisons of algorithm performance in the literature. With the algorithm proposed in this paper, differences in computational performance will not be affected by the topological characteristics of the lattices.
This paper also shows that the topological characteristics of regular lattices (with square and hexagonal polygons) and Voronoi diagrams (commonly used to emulate irregular lattices) are far from the topological characteristics that are found in real lattices.
Notes
1. See also Anselin [3], p. 51.
2. Later in this paper, we show that the topological characteristics of Voronoi diagrams are far from those for an “average” map sampled in different parts of the world.
3. Complex polygons do not refer to polygons that exist in the Hilbert plane [19].
4. This paper focuses exclusively on two-dimensional lattices (i.e., n = 2).
5. An alternative category is proposed for lattices formed by fractal polygons that are informally defined by Mandelbrot [42] as rough fragmented geometric shapes that could be infinitely divided into scalable parts.
6. Vertices are taken to be all of the points of the lattice at which three or more polygons meet.
7. There are three types of symmetrical relationships: translation, when the lattice is formed by translating a subset of polygons; reflection, when there are axes of reflection in the lattice; and rotation, when it is possible to obtain the same lattice after a rotation of less than 2π [51].
8. See Anselin [3] for more information about this matrix.
9. Each of the six thousand instances of Voronoi diagrams comes from uniformly distributed points.
10. Chen [15] established a relationship between these two approaches.
11. Polar coordinates allow us to “wrap” a mean-reverting process, with fractal characteristics, around a circle to build a polygon. It is important to clarify, however, that once we obtain those coordinates, we draw them in the Cartesian coordinate system.
12. To calculate the structural dimension, we use the EXACT procedure devised by Allen et al. [2], with a small value for Δt. Next, both dimensions were determined by using a k-means clustering algorithm over the cloud of points on the Richardson plot.
13. There is no proven computational advantage or theoretical reason behind the decision to implement a depth-first strategy. We follow this strategy because it simplified the coding structure.
14. The code to generate RI-Maps is available to the academic community as a utility within the module “inputs” in clusterPy V.0.10.0, an open source cross-platform library of spatial clustering algorithms written in Python [22]. To access the repository go to: https://code.google.com/p/clusterpy.
15. Future research will be devoted to reducing computational time and exploiting the possibilities of parallelization.
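The idea in Note 11 of wrapping a mean-reverting process around a circle can be illustrated with a short sketch. This is not the RI-Maps implementation: it assumes a discretized Ornstein-Uhlenbeck process (Euler-Maruyama scheme) for the radius as a function of the angle, and the function name and the parameters `mean_radius`, `reversion` and `volatility` are hypothetical and do not correspond to the parameters p_1, p_2 and p_3 of the paper. The sketch also does not force the radius at 2π to match the radius at 0, so the closing edge may show a jump.

```python
import math
import random


def mean_reverting_polygon(n_vertices=360, mean_radius=1.0, reversion=0.5,
                           volatility=0.1, seed=None):
    """Return Cartesian vertices of a polygon whose radius follows a
    mean-reverting process indexed by the polar angle (illustrative only)."""
    rng = random.Random(seed)
    d_theta = 2.0 * math.pi / n_vertices  # angular step plays the role of time
    radius = mean_radius
    points = []
    for k in range(n_vertices):
        theta = k * d_theta
        # Polar (radius, theta) converted to Cartesian (x, y) for drawing.
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
        # Euler-Maruyama step: pull toward mean_radius plus a Gaussian shock.
        shock = rng.gauss(0.0, 1.0)
        radius += (reversion * (mean_radius - radius) * d_theta
                   + volatility * math.sqrt(d_theta) * shock)
        # Keep the radius strictly positive so the vertex order stays angular.
        radius = max(radius, 1e-6 * mean_radius)
    return points


if __name__ == "__main__":
    vertices = mean_reverting_polygon(n_vertices=12, seed=42)
    for x, y in vertices:
        print(round(x, 3), round(y, 3))
```

Because the angle increases monotonically and the radius stays positive, the resulting vertex sequence traces a simple star-shaped outline around the origin; a larger `volatility` yields a rougher boundary.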
References
Aldstadt J, Getis A (2006) Using AMOEBA to create a spatial weights matrix and identify spatial clusters. Geogr Anal 38(4):327–343
Allen M, Brown G, Miles N (1995) Measurement of boundary fractal dimensions: review of current techniques. Powder Technol 84(1):1–14
Anselin L (1988) Spatial econometrics: methods and models, 1st edn. Kluwer Academic, Dordrecht
Anselin L, Moreno R (2003) Properties of tests for spatial error components. Reg Sci Urban Econ 33(5):595–618
Anselin L, Smirnov O (1996) Efficient algorithms for constructing proper higher order spatial lag operators. J Reg Sci 36(1):67–89
Anselin L, Bera A, Florax R, Yoon M (1996) Simple diagnostic tests for spatial dependence. Reg Sci Urban Econ 26(1):77–104
Anselin L, Florax R, Rey SJ (2004) Advances in spatial econometrics: methodology, tools and applications. Springer, Berlin
Aste T, Szeto K, Tam W (1996) Statistical properties and shell analysis in random cellular structures. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 54(5):5482–5492
Bartlett MS (1975) Probability, statistics, and time: a collection of essays, 1st edn. Chapman and Hall, New York
Batty M, Longley P (1994) Fractal cities: a geometry of form and function. Harcourt Brace & Company, London
Blommestein H, Koper N (1992) Recursive algorithms for the elimination of redundant paths in spatial lag operators. J Reg Sci 32(1):91–111
Boots B (1982) Comments on the use of eigenfunctions to measure structural properties of geographic networks. Environ Plan A 14:1063–1072
Boots B (1984) Evaluating principal eigenvalues as measures of network structure. Geogr Anal 16(3):270–275
Boots B (1985) Size effects in the spatial patterning of nonprincipal eigenvectors of planar networks. Geogr Anal 17(1):74–81
Chen Y (2011) Derivation of the functional relations between fractal dimension of and shape indices of urban form. Comput Environ Urban Syst 35:442–451
Church RL (2008) BEAMR: an exact and approximate model for the p-median problem. Comput Oper Res 35(2):417–426
Clark W (1964) The concept of shape in geography. Am Geogr Soc 54(4):561–572
Cole J (1964) Study of major and minor civil divisions in political geography. In: 20th International geographical congress, Sheffield, University of Nottingham
Coxeter HSM (1974) Regular complex polytopes. CUP Archive, Cambridge
Douglas D (1973) Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: Int J Geogr Inf Geovisualization 10(2):112–122
Duque J, Aldstadt J, Velasquez E, Franco J, Betancourt A (2011) A computationally efficient method for delineating irregularly shaped spatial clusters. J Geogr Syst 13:355–372
Duque JC, Dev B, Betancourt A, Franco JL (2011) ClusterPy: Library of spatially constrained clustering algorithms, Version 0.9.9.
Duque JC, Anselin L, Rey SJ (2012) The max-p-regions problem. J Reg Sci 52(3):397–419
Elhorst JP (2003) Specification and estimation of spatial panel data models. Int Reg Sci Rev 26(3):244–268
Frankhauser P (1998) The fractal approach: a new tool for the spatial analysis of urban agglomerations. Popul Engl Sel 10(1):205–240; New Methodological Approaches in the Social Sciences. www.persee.fr/doc/pop_0032-4663_1998_hos_10_1_6828
Garrison WL, Marble DF (1964) Factor-analytic study of the connectivity of a transportation network. Pap Reg Sci Assoc 12(1):231–238. doi:10.1007/BF01941256
Marsaglia G, Tsang WW, Wang J (2003) Evaluating Kolmogorov’s distribution. J Stat Softw 8(18):1–4
Ghyka M (2004) The geometry of art and life. Kessinger Publishing, Whitefish, MT
Gibbs J (1961) A method for comparing the spatial shapes of urban units. In: Urban research methods. D. Van Nostrand Company, Inc, Princeton, pp 96–106
Gould P (1967) On the geographical interpretation of eigenvalues. Trans Inst Br Geogr 42(42):53–86
Griffith D (1987) Toward a theory of spatial statistics: another step forward. Geogr Anal 19(1):69–82
Grunbaum B, Shephard GC (2011) Tilings and patterns. Dover Publications, New York
Haggett P (1977) Locational analysis in human geography. Wiley, New York
Haining R (2010) The nature of georeferenced data. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin, pp 197–217
Hijmans R, Guarino L, Jarvis A, O’Brien R, Mathur P, Bussink C, Cruz M, Barrantes I, Rojas E (2005) DIVA-GIS 7.1.7. Available in http://www.diva-gis.org/
Hooper P, Hewings G (1981) Some properties of space-time processes. Geogr Anal 13(3):203–223
Horton R (1932) Drainage basin characteristics. Trans AGU 13(1):350–361
Johnson DL (2001) Symmetries, 1st edn. Springer, London
Kindratenko V, Treiger B (1996) Chemometrical approach to the determination of the fractal dimension(s) of real objects. Chemom Intell Lab Syst 34:103–108
Le Caer G, Delannay R (1993) The administrative divisions of mainland France as 2D random cellular structures. J Phys I 3(8):1777–1800
Le Caer G, Delannay R (1995) Topological models of 2D fractal cellular structures. J Phys I 5(11):1417–1429
Mandelbrot BB (1982) The fractal geometry of nature. W.H. Freeman, San Francisco
Mao X (1997) Stochastic differential equations and applications, 1st edn. Horwood Publishing, Chichester
Miller V (1953) A quantitative geomorphic study of drainage basin characteristics in the Clinch Mountain area, Virginia and Tennessee. Department of Geology, Columbia University
Mur Lacambra J (1992) Contrastes de autocorrelación espacial: Un estudio de Monte Carlo. Estadística Española 34(130):285–308
Murray AT, O’Kelly ME (2002) Assessing representation error in point-based coverage modeling. J Geogr Syst 4(2):171–191
Ord K (1975) Estimation methods for models of spatial interaction. J Am Stat Assoc 70(349):120–126
Pace R, LeSage JP (2004) Chebyshev approximation of log-determinants of spatial weight matrices. Comput Stat Data Anal 45(2):179–196
Penrose R (1974) The rôle of aesthetics in pure and applied mathematical research. J Inst Math Appl 10:266–271
Peshkin M, Strandburg K, Rivier N (1991) Entropic predictions for cellular networks. Phys Rev Lett 67(13):1803–1806
Radin C (1993) Symmetry of tilings of the plane. Bull Am Math Soc 29(2):213–217
ReVelle C, Swain R (1970) Central facilities location. Geogr Anal 2(1):30–42
Richardson L (1961) The problem of contiguity: an appendix of statistics of deadly quarrels. General Systems Yearbook 6(13):139–187
Ross IC, Harary F (1952) On the determination of redundancies in socioeconometric chains. Psychometrika 17(2):195–208
Smirnov O, Anselin L (2001) Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Comput Stat Data Anal 35(3):301–319
Stoddart D (1965) The shape of atolls. Mar Geol 3(5):269–283
Tilley R (2006) Crystals and crystal structures, 1st edn. Wiley, Chichester
Tinkler K (1972) The physical interpretation of eigenfunctions of dichotomous matrices. Trans Inst Br Geogr 55(55):17–46
Weaire D, Rivier N (2009) Soap, cells and statistics—random patterns in two dimensions. Contemp Phys 50(1):199–239
Werritty A (1969) On the form of drainage basins, 1st edn. Department of Geography, Pennsylvania State University, Pennsylvania
Whittle P (1954) On stationary processes in the plane. Biometrika 41(3):434–449
Acknowledgements
The authors wish to thank Colciencias (Departamento Administrativo de Ciencia, Tecnología e Innovación) for its financial support under the program “Jóvenes Investigadores.” The authors also thank the Cyberinfrastructure Service for High Performance Computing, “Apolo,” at Universidad EAFIT, for allowing us to run our computational experiments on their supercomputer. The usual disclaimer applies.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Duque, J.C., Betancourt, A., Marin, F.H. (2018). An Algorithmic Approach for Simulating Realistic Irregular Lattices. In: Thill, JC., Dragicevic, S. (eds) GeoComputational Analysis and Modeling of Regional Systems. Advances in Geographic Information Science. Springer, Cham. https://doi.org/10.1007/978-3-319-59511-5_15
DOI: https://doi.org/10.1007/978-3-319-59511-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59509-2
Online ISBN: 978-3-319-59511-5
eBook Packages: Earth and Environmental Science (R0)