1 Introduction

In mathematics, a weighted Voronoi diagram (WVD) in n dimensions is a Voronoi diagram for which the Voronoi cells are defined in terms of a distance defined by some common metrics modified by weights assigned to generator points. The multiplicatively weighted Voronoi diagram (MWVD) is defined when the distance between points is multiplied by positive weights [1].

The dominance regions that make up a Voronoi diagram can be generalized according to the type of generator (points, lines, or polygons), the weights, constraints in the plane, and the metric space used. Many researchers have focused on issues of WVD generation.

Hu et al. [2] used a map algebra method to generate MWVDs for point, line, and polygon features on raster data. Wang et al. [3] presented a raster-based algorithm supported by ArcInfo, which is capable of generating Voronoi diagrams for points, lines, and polygons in two-dimensional space. On the basis of Ref. [4], Dong [5, 6] provided a discrete algorithm for generating line segment WVDs. However, the algorithm did not consider more complex generators such as line features with an arbitrary shape. Wu and Luo [7] proposed an algorithm based on cellular automata for constructing Voronoi diagrams of complex entities in grid space. Dong [8] noted applications of MWVDs for points, polylines, and polygons using ArcGIS. Fan [9] presented a vector-based algorithm for generating WVDs, which, however, is also not suitable for complex spatial objects. Gong et al. [10] developed a vector-based algorithm in C# to generate and update MWVDs for points, polylines, and polygons and presented several examples. However, like all other vector-based methods, when dealing with complex generators such as lines and areas, the algorithm first needs to decompose them into simpler elements, which may violate spatial integrity. Very recently, many researchers have used graphics hardware to improve the performance of computing Voronoi diagrams. Rong et al. [11] presented a GPU-assisted Voronoi diagram algorithm for computing centroidal Voronoi tessellation (CVT). Xu et al. [12] proposed a raster-based algorithm to construct WVDs with the GPU, which is capable of generating discrete WVDs in real time. In addition, Afsin et al. [13] proposed an approach for generating a spatial index and a Voronoi diagram and for efficiently processing a wide range of geospatial queries.

Although theoretical and computational aspects of WVDs have been extensively discussed, some problems remain, especially in the case of complex generators. There are three major issues in generating WVDs: (1) Vector-based algorithms must handle so many factors in calculation and storage that it is difficult to generate WVDs for complex generators directly. (2) Raster-based approaches involve judging and computing distances for each grid cell, so they require a large amount of calculation and have to tolerate limited precision. (3) Since sequential algorithms, especially raster-based approaches, are inefficient at handling large-scale map data, parallel and high-efficiency methods for complex spatial features based on cloud computing are a reasonable choice. Hence, considering the types of generators and various weights, we present a raster-based parallel approach with Hadoop to generate MWVDs for polygons.

The remainder of this chapter is organized as follows. In Sect. 2, we review the relevant concepts of Voronoi diagrams, which form the foundation of the proposed method. In Sect. 3, the raster-based parallel algorithm is discussed together with its pseudo-code. Section 4 presents experimental results to verify the performance and scalability of the proposed approach. Potential applications of the proposed approach are discussed in Sect. 5. Finally, Sect. 6 gives the conclusions of the chapter and directions for future work.

2 Background: Voronoi Diagram

A Voronoi diagram divides a space into disjoint polygons such that the nearest generator of any point inside a polygon is the generator of that polygon. In this section, we review the relevant concepts of Voronoi diagrams [14].

2.1 Ordinary Voronoi Diagram

Consider a finite set of points, called generator points, in the Euclidean plane (in general, generators can be any type of spatial object). Every location in the plane can be assigned to the closest generator(s) under a given distance metric. The set of locations assigned to each generator forms the Voronoi polygon of that generator. The set of Voronoi polygons associated with all the generators is called the Voronoi diagram. The Voronoi polygon and Voronoi diagram can be formally defined as follows. Assume a set of generators \( G = \{ g_{1} ,g_{2} , \ldots ,g_{n} \} \), where \( 2 < n < \infty \) and \( g_{i} \ne g_{j} \) for \( i \ne j,\,i,\,j \in I_{n} = \{ 1, \ldots ,n\} \). For each generator, the Voronoi polygon is given by

$$ V(g_{i} ) = \bigcap\limits_{j \ne i} {\{ p\left| {d(p,g_{i} ) < d(p,g_{j} )} \right.\} } $$
(1)

where \( d(p,g_{i} ) \) specifies the minimum distance between \( p \) and \( g_{i} \). The set \( V = \{ V(g_{1} ), \ldots ,V(g_{n} )\} \) is called the Voronoi diagram generated by \( G \).

2.2 Weighted Voronoi Diagram

In many applications, not only the location but also the weight (or importance) and the spatial extent of a site should be taken into account. Different generators influence their surroundings to different degrees, so the ordinary Voronoi diagram often cannot meet the needs of general spatial analysis, and the approach needs to be generalized. WVDs can be divided into two types: MWVDs [15] and additively WVDs. For the former, the distance between points is multiplied by positive weights. For the latter, positive weights are subtracted from the distances between points. Based on these concepts, we now add weights to the distance used in the ordinary Voronoi diagram.

Let \( g_{i} \in G \) be an element with positive weight \( \omega_{i} \). The weighted distance \( d_{\omega } (p,g_{i} ) \) between \( p \) and \( g_{i} \) is \( d_{\omega } (p,g_{i} ) = d(p,g_{i} )/\omega_{i} \). Then, the dominance region of generator \( g_{i} \), denoted \( V_{\omega } (g_{i} ) \), can be represented by

$$ V_{\omega } (g_{i} ) = \bigcap\limits_{j \ne i} {\{ p\left| {d_{\omega } (p,g_{i} ) < d_{\omega } (p,g_{j} )} \right.\} }. $$
(2)
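As a minimal sketch of Eq. (2) for point generators, the following Python snippet assigns a location to the generator with the smallest weighted distance \( d(p,g_{i})/\omega_{i} \). The function names, the sample coordinates, and the use of the Euclidean metric are illustrative assumptions, not part of the original method. Setting all weights to 1 recovers the ordinary Voronoi assignment of Eq. (1).

```python
import math

def weighted_distance(p, g, w):
    """Weighted distance d(p, g) / w, assuming a Euclidean d (illustrative choice)."""
    return math.dist(p, g) / w

def dominant_generator(p, generators, weights):
    """Index of the generator with the smallest weighted distance to p.

    Ties are resolved in favour of the lowest index here; under the strict
    inequality of Eq. (2) such points lie on a boundary and belong to no region.
    """
    return min(
        range(len(generators)),
        key=lambda i: weighted_distance(p, generators[i], weights[i]),
    )

# Hypothetical sample data: two generators on the x-axis and one query point.
generators = [(0.0, 0.0), (10.0, 0.0)]
p = (6.0, 0.0)

ordinary = dominant_generator(p, generators, [1.0, 1.0])  # plain distances 6 vs 4
weighted = dominant_generator(p, generators, [3.0, 1.0])  # weighted distances 2 vs 4
print(ordinary, weighted)
```

With equal weights the point falls in the region of the second generator; giving the first generator a weight of 3 enlarges its region enough to capture the same point.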

2.3 Weighted Voronoi Diagram for Polygons

In general, generators can be any type of spatial object, such as points, lines, and polygons. The WVD for polygons [9, 16] is an important generalization of the ordinary Voronoi diagram in two respects: the generator type and the weight. Most methods of computing Voronoi diagrams have some difficulty in handling complex generators (generators other than points). Many scholars have discussed the Voronoi diagram for lines or for polygons. Next, we focus on the WVD for polygons.

Assume a set of polygons \( P = \{ p_{1} ,\,p_{2} , \ldots ,p_{n} \} \), where \( d(p,\,p_{i} ) \) specifies the minimum distance between a point \( p \) and polygon \( p_{i} \). Then, the dominance region of \( p_{i} \) in the weighted Voronoi diagram generated by \( P \) can be represented by

$$ V_{\omega } (p_{i} ) = \bigcap\limits_{j \ne i} {\left\{ {p\left| {\frac{{d(p,p_{i} )}}{{\omega_{i} }} < \frac{{d(p,p_{j} )}}{{\omega_{j} }}} \right.} \right\}} .$$
(3)

3 Raster-Based MWVDs Parallel Algorithm with MapReduce

The general idea of the raster-based method [17] is to calculate the raster distance of points and obtain the neighbor relationships between generators based on the idea of distance transformation. After map rasterization, all spatial features are translated into grid points. In the raster metric space, points, lines, and polygons are all processed as sets of raster cells. All diagrams can be represented by a discrete grid lattice of size \( N \times N \), which gives \( N^{2} \) points in the plane. A unique identifier is placed in each grid cell. However, the three types of generator differ in some respects. Our approach focuses on finding the raster point sets of polygons. Taking polygon features as an example, our method adds an edge point extraction step to the traditional Voronoi diagram generation for points.

3.1 Raster-Based Weighted Voronoi Diagram Generation

After rasterization of the primitive data, complex generators such as polygons are translated into many grid cells. This creates a heavy computational burden and increases the running time, especially if a large number of features are involved in generating the Voronoi diagram. When calculating the distance, however, we only need the minimum distance between a polygon and each other point. Assuming area features without internal voids, and letting a raster point be any grid cell outside the generators, the minimum distance is the distance between the point and the generator edge. Therefore, polygons can be simplified to their edges to shorten the calculation time.

Edge point extraction for a polygon means judging whether a raster point is a boundary point or not. Our approach uses a simple criterion based on the four-neighbor points of each generator cell. If any of the four neighbors is a blank cell, the raster point is a boundary point; otherwise, it is discarded (Fig. 1).

Fig. 1 Distance calculation from point to polygons

In the raster space, point \( P(m,n) \) is any blank cell and \( A(i,j) \) is a raster point of a generator; the shaded part is its four-neighborhood. If none of its four neighbors is a blank cell, point A is not a boundary point, and the distance between \( P(m,n) \) and \( A(i,j) \) cannot be the minimum among the five distances from P to A and its four neighbors. It is therefore unnecessary to compute the distance between internal points (like A) and blank cells. We only need to compute the distance between boundary points and blank cells.
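The four-neighbor rule and the argument above can be checked on a small grid. The following Python sketch is illustrative only: the 5 × 5 grid, the function name `is_boundary`, the convention that 0 marks a blank cell, and the choice to ignore neighbors outside the grid are all assumptions. It extracts the boundary cells of a 3 × 3 polygon and confirms that, for every blank cell, the minimum Euclidean distance to the boundary cells equals the minimum distance to all polygon cells.

```python
import math

FOUR_NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))

def is_boundary(grid, i, j):
    """True if generator cell (i, j) has at least one blank (0) four-neighbor.

    Neighbors that fall outside the grid are ignored (an assumption).
    """
    rows, cols = len(grid), len(grid[0])
    for di, dj in FOUR_NEIGHBORS:
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols and grid[ni][nj] == 0:
            return True
    return False

# Hypothetical raster: 0 = blank cell, 1 = cell of polygon 1.
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
size = len(grid)
cells = [(i, j) for i in range(size) for j in range(size) if grid[i][j] == 1]
blank = [(i, j) for i in range(size) for j in range(size) if grid[i][j] == 0]
boundary = [c for c in cells if is_boundary(grid, *c)]

# Dropping the interior cell (2, 2) never changes the minimum distance.
same_minimum = all(
    min(math.dist(p, c) for c in cells) == min(math.dist(p, c) for c in boundary)
    for p in blank
)
print(len(cells), len(boundary), same_minimum)
```

Only the center cell is interior in this example, so the saving is small; the benefit grows with the area of the polygons, since the number of interior cells grows faster than the number of boundary cells.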

The steps of the method are as follows:

  Step 1. Rasterization of primitive data. For every area generator, use a finite number of raster points to approximate its region. All other regions are translated into blank cells.

  Step 2. Edge point extraction for polygons. Traverse all raster points of the generators and judge whether any of the four-neighbor points of each one is a blank cell. If so, mark the point as a boundary point and record it.

  Step 3. Computing the weighted distance and deciding the label of each grid cell. Calculate the weighted distance between every blank cell and the set of boundary points. Assign to each blank cell the label of the boundary point at the smallest weighted distance, until the values of all blank cells are determined.

  Step 4. Output the final matrix.
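The steps above can be sketched sequentially in Python as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the raster is assumed to be already produced by Step 1, 0 is assumed to mark blank cells, the Euclidean metric is used, and the function names and the 3 × 7 sample grid are hypothetical.

```python
import math

def boundary_points(grid):
    """Step 2: generator cells with a blank (0) four-neighbor, grouped by label."""
    rows, cols = len(grid), len(grid[0])
    result = {}
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 0:
                continue
            neighbors = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
            if any(0 <= a < rows and 0 <= b < cols and grid[a][b] == 0
                   for a, b in neighbors):
                result.setdefault(grid[i][j], []).append((i, j))
    return result

def label_row(i, row, boundary, weights):
    """Step 3 for one row: each blank cell takes the label of the generator
    with the smallest weighted distance; generator cells keep their label."""
    labelled = []
    for j, value in enumerate(row):
        if value == 0:
            value = min(
                boundary,
                key=lambda lab: min(math.dist((i, j), b) for b in boundary[lab])
                / weights[lab],
            )
        labelled.append(value)
    return labelled

def weighted_voronoi(grid, weights):
    """Steps 2 to 4 on an already rasterized grid (Step 1 is assumed done)."""
    boundary = boundary_points(grid)
    return [label_row(i, row, boundary, weights) for i, row in enumerate(grid)]

# Hypothetical raster with two single-cell generators and weights 3:2.
grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0, 0, 0],
]
result = weighted_voronoi(grid, {1: 3.0, 2: 2.0})
for row in result:
    print(row)
```

In the middle row, generator 1 claims the cells up to column 3 and generator 2 the rest, because the larger weight of generator 1 shifts the dividing line toward generator 2.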

Taking a discrete grid lattice of size \( 10 \times 10 \) as an example, Fig. 2 shows the process of generating a WVD for polygons, with \( \omega_{1} :\omega_{2} = 3:2 \). There are two area features as generators, marked 1 and 2 (Fig. 2a). After rasterization, they occupy 37 cells. The weighted distance between a blank cell \( p \) and polygon 1 can be represented by \( D_{1} (p,p_{1} ) = d(p,p_{1} )/\omega_{1} \). To obtain \( \hbox{min} \{ D_{1} (p,p_{1} ),D_{2} (p,p_{2} )\} \) for each blank cell, 37 distance calculations would be needed. After edge point extraction (Fig. 2b), the number of calculations drops to 24, which noticeably reduces the computational cost.

Fig. 2 Weighted Voronoi diagrams generation for polygons

3.2 MapReduce Parallelization

The MapReduce programming model is expressed in two functions, namely a map function and a reduce function. The map function processes a key/value pair to generate a set of intermediate key/value pairs, while the reduce function merges all intermediate values associated with the same intermediate key [18]. Both functions are written by the user. This model makes it easy for programmers to design parallel and distributed applications. In our Hadoop implementation, the generation of the WVD is implemented in one MapReduce job (Tables 1 and 2).

Table 1 Algorithm: VoronoiMapper (key, value)
Table 2 Algorithm: VoronoiReducer(key, value)

The task of the map function is to determine the minimum distance between blank cells and generators and to decide the label of each grid cell. We use two files as input data: one is the original raster data file, and the other records the boundary points. The map function takes as input a \({<}i,recordLine{>} \) pair in which \( i \) is the line number in the original matrix. Every recordLine stores one row of grid cells. The map function must read the boundary point file to obtain the locations of the generators in the original matrix. Finally, it outputs the \( {<}i,newLine{>}\) pair as intermediate output. The output value stores the labels of one row of grid cells.

In the reduce function, the intermediate results are merged, sorted, and summed to output the final matrix. It takes as input a \( {<}i,recordLine{>}\) pair. The output of the function is the same as its input. Because the MapReduce framework has already ordered the record lines by key after the map phase, the reduce function only needs to output the final data.
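The data flow described above can be imitated in a single Python process. The sketch below is not the Hadoop implementation and does not use any Hadoop API: `voronoi_map`, `voronoi_reduce`, and `run_job` are hypothetical names, the boundary points are passed as an in-memory dictionary instead of being read from a file, and `run_job` is only a stand-in for the framework's shuffle and sort phase. The contents of Tables 1 and 2 are not reproduced here.

```python
import math
from collections import defaultdict

def voronoi_map(i, record_line, boundary, weights):
    """Map: label every blank cell (0) of row i and emit one (i, new_line) pair."""
    new_line = []
    for j, value in enumerate(record_line):
        if value == 0:
            value = min(
                boundary,
                key=lambda lab: min(math.dist((i, j), b) for b in boundary[lab])
                / weights[lab],
            )
        new_line.append(value)
    yield i, new_line

def voronoi_reduce(i, lines):
    """Reduce: pass each labelled row through unchanged."""
    for line in lines:
        yield i, line

def run_job(matrix, boundary, weights):
    """In-process stand-in for the shuffle/sort between map and reduce."""
    groups = defaultdict(list)
    for i, row in enumerate(matrix):
        for key, value in voronoi_map(i, row, boundary, weights):
            groups[key].append(value)
    output = []
    for key in sorted(groups):  # rows come out ordered by line number
        output.extend(line for _, line in voronoi_reduce(key, groups[key]))
    return output

# Hypothetical one-row raster with precomputed boundary points and weights 3:2.
matrix = [[1, 0, 0, 0, 0, 0, 2]]
boundary = {1: [(0, 0)], 2: [(0, 6)]}
final = run_job(matrix, boundary, {1: 3.0, 2: 2.0})
print(final)
```

Because each row is labelled independently given the shared boundary point set, rows can be processed by different mappers without communication, which is what makes the job straightforward to parallelize.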

3.3 Analysis

In practice, the MapReduce implementation of the method runs on multiple machines in parallel. When discussing the method complexity, we consider the fact that it runs in parallel and discuss the parallel method complexity. In the MapReduce job, the computational complexity of the associated reduction is as follows:

$$ (m \times n - g_{k} ) \times g_{k} \times O/{\text{nodes}} $$

where \( m \times n \) is the total number of grid cells, \( k \) is the number of generators, \( g_{k} \) is the total number of boundary points after edge points extraction of polygon, \( \text{nodes} \) is the number of the Hadoop nodes, and \( O \) is the computational complexity of weight distance computing.
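As a rough worked example of this expression, the Python snippet below plugs in the \( 10 \times 10 \) lattice with 24 boundary points from Sect. 3.1 and the eight datanodes of the cluster described in Sect. 4. Treating \( O \) as one unit of work per distance evaluation is an assumption made purely for illustration, and the function names are hypothetical.

```python
def distance_evaluations(m, n, g_k):
    """(m * n - g_k) * g_k: the cell count times boundary count from the formula."""
    return (m * n - g_k) * g_k

def per_node_cost(m, n, g_k, nodes, unit_cost=1.0):
    """Cost per node, with unit_cost standing in for O (assumed to be 1 here)."""
    return distance_evaluations(m, n, g_k) * unit_cost / nodes

total = distance_evaluations(10, 10, 24)      # (100 - 24) * 24
per_node = per_node_cost(10, 10, 24, nodes=8)
print(total, per_node)
```

The estimate ignores scheduling and communication overhead, so it should be read as an idealized lower bound rather than a predicted running time.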

4 Results and Discussion

In this section, we evaluate the performance of the algorithm implementation, not its accuracy, on our cluster system. The core idea of Apache Hadoop [19] is the MapReduce programming model. Our experimental hardware consists of a nine-node cluster: one namenode and eight datanodes. Each node in the cluster is equipped with four quad-core 3.10 GHz Intel Core(TM) i5-2400 processors, 4 GB of memory, and 500 GB of disk, runs Fedora 15, and is connected by Fast Ethernet. All experiments described in this chapter were run using Hadoop version 0.20.2 and Java 1.7.0.04, with the data stored in HDFS with two replicas per block.

4.1 Single Machine Environment Versus Hadoop Pseudo-Distributed Environment

Using the same data scales and the same hardware configuration, we compare the time needed to generate WVDs in a single machine environment and in a Hadoop pseudo-distributed environment. The experimental results show that the algorithm running in the single machine environment needs less time when the data scale is small. However, when the data scale increases beyond a certain point, it reports an out-of-memory error and cannot complete the calculation tasks, whereas the tasks complete successfully in the Hadoop pseudo-distributed environment.

In our analysis, when the data scale is small, coordination between nodes and task scheduling consume most of the resources, so the calculation tasks take longer on Hadoop. When the data scale increases, the single machine environment cannot meet the computing demand for reasons such as growing memory consumption. The Hadoop platform, however, can handle large datasets (Table 3).

Table 3 Contrast of execution time between single and parallel systems

4.2 Experiment Analysis for Cluster System

In the following experiments, we use three gigabit-scale datasets as the original data: DS1, DS2, and DS3. The datasets are described in detail in Table 4.

Table 4 Experimental datasets

For each of the three datasets, we shut down a certain number of datanodes in each run; the running time of each experiment is shown in Fig. 3. The running time decreases as the number of nodes grows. For a fixed data scale, increasing the number of nodes significantly improves the processing ability of the cluster, and the speedup is approximately linear in the number of datanodes. This shows that the speed of generating WVDs for polygons is markedly increased in the Hadoop distributed environment.

Fig. 3 Running time

5 Case Study

In the previous section, we verified the performance of the parallel algorithm. We now try our approach in some practical applications. Because of the limits of our experimental conditions, we choose only parts of maps as the original map data. Our original datasets are raster data obtained by map rasterization using ArcGIS software.

5.1 Case One: Green Space Planning

According to the green space plan of Xi’an city, the city will vigorously develop its green space system to become a national garden city. Urban greenbelt elements can appear as green space blocks of various shapes and sizes. Parks, shelter forests, and water conservation districts are strip or area features. The effect of greenbelts on the environment depends on various factors such as area, size, tree species, and purification ability. We abstract greenbelts into polygons, assign weights to these generators, and then generate the WVD for the greenbelts to support decisions on green construction.

This chapter takes the greenbelts in Xi’an as an example and analyzes their spheres of influence. Figure 4a shows the original map data for part of Xi’an city, in which nine greenbelts are chosen as generators. Setting the weights according to the size of the greenbelts, we obtain the raster WVD in Fig. 4b. Figure 4c is the final vector diagram, which marks off the sphere of influence of every greenbelt. Regions A and B are located at the interface of several greenbelts, at the edge of their spheres of influence, and are therefore only weakly influenced by the greenbelts. We thus suggest adding greenbelts in these regions.

Fig. 4 WVDs for polygons of greenbelts

5.2 Case Two: Optimal Path Planning Problem

Given some obstacles in space or in a plane, the retraction method for motion planning uses the Voronoi diagram to determine whether there exists an optimal path from an initial position to a final position. If an obstacle can be approximately regarded as a particle, the safest path can follow the Voronoi edges. If an obstacle cannot be approximated by a particle, a generalized Voronoi diagram can be employed (with lines, polygons, or polyhedrons as generators). In addition, different obstacles may have different criticality, so their weights differ. For example, when obstacles are contaminated or dangerous zones, the path needs to keep away from them, so their weights should be smaller than those of safe obstacles. Our approximate fast algorithm computes the WVD for polygons to obtain the Voronoi edges, which form the final optimal path. Figure 5a gives the distribution of some obstacles with different weights. We then obtain the optimal path by following the Voronoi edges (Fig. 5b).

Fig. 5 Optimal path generation of obstacles
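The chapter does not state how Voronoi edges are read off the raster result. One simple possibility, sketched here in Python purely as an assumption, is to mark every cell that has a four-neighbor with a different label in the final label matrix. The function name `voronoi_edge_cells` and the sample matrix are hypothetical.

```python
def voronoi_edge_cells(labels):
    """Cells that touch a cell with a different label (assumed edge criterion).

    Only the lower and right neighbors are inspected, so each adjacent pair is
    compared once; both cells of a differing pair are marked as edge cells.
    """
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for i in range(rows):
        for j in range(cols):
            for ni, nj in ((i + 1, j), (i, j + 1)):
                if ni < rows and nj < cols and labels[ni][nj] != labels[i][j]:
                    edges.add((i, j))
                    edges.add((ni, nj))
    return edges

# Hypothetical label matrix with two dominance regions.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
edge_cells = sorted(voronoi_edge_cells(labels))
print(edge_cells)
```

The resulting edge is two cells wide because cells on both sides of a label change are marked; a path planner would typically thin this band or pick one side before following it.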

6 Conclusion and Future Work

This chapter presents a parallel approach for generating weighted raster-based Voronoi diagrams on the Hadoop platform. The approach is based on the traditional distance transformation and the MapReduce model. Taking the type and weight of generators into account, distance computation is simplified by extracting the edge points of polygons. The experiments show that the approach significantly improves performance, with an approximately linear scale-up as the number of nodes increases. The application cases show that the approach can be applied to urban green space planning and to path planning. Further work with the presented method includes selecting reasonable weights for different cases and applying the algorithm to applications such as data collection in sensor networks, emergency modeling, and Voronoi-based geospatial queries.