Mining High Utility Co-location Patterns Based on Importance of Spatial Region

Zhao, Jiasong; Wang, Lizhen; Yang, Peizhong; Chen, Hongmei

doi:10.1007/978-981-13-0896-3_5

Jiasong Zhao^13,14,
Lizhen Wang¹³,
Peizhong Yang¹³ &
…
Hongmei Chen¹³

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 849))

Included in the following conference series:

International Conference on Geo-Spatial Knowledge and Intelligence

1103 Accesses
1 Citations

Abstract

Co-location pattern mining aims at finding the subsets of spatial features whose instances are frequently located together in geographic space. Most studies mainly focus on whether spatial feature instances are frequently located together. However, the utilities of spatial instances in different space regions are different. Based on the importance of spatial a region, the utility value of the region is determined, and then a utility participation index of co-location patterns as a new interestingness measure is defined. We present a basic high utility co-location pattern mining algorithm. To reduce the computational cost, an improved mining algorithm with pruning strategy is developed by cutting down the search space. The experiments on synthetic and real world datasets show that the proposed methods are effective and efficient.

Access provided by CONRICYT-eBooks. Download conference paper PDF

Efficiently Mining High Utility Co-location Patterns from Spatial Data Sets with Instance-Specific Utilities

Mining high influence co-location patterns from instances with attributes

Article 18 November 2019

Mining Co-location Patterns with Dominant Features

Keywords

1 Introduction

Owing to the rapid generation of spatial data and wide use of spatial databases, it’s essential to discover spatial knowledge automatically from spatial datasets. Spatial co-location mining is to discover a set of spatial features of which instances frequently appear in a spatial neighborhood of each other. Mining spatial co-location patterns is an important spatial data mining task with broad applications [1, 2].

The participation index is used to measure the prevalence (or frequency) of a co-location pattern in the traditional approaches of co-location pattern mining. It mainly focused on whether spatial feature instances are frequently located together. However, the utility of the feature instances in different spatial regions is different. For example, in the study of plant symbiosis, the symbiotic plants that grow in the poor ecological environment (soil, humidity, altitude, etc.) are more general than those that grow in the good ecological environment. Another example, in the research field of public safety, social effect of criminal cases occurring in the downtown is absolutely different from ones occurring in the school. Therefore, it is necessary to establish the importance of spatial region in mining spatial co-location patterns. After setting of region utility values based on regional importance, the utility value of the instance is the utility value of the region where the instance is located. By the way of mining high utility co-location patterns similar to the literature [3], the method of discovery high utility co-location patterns can not only detect some low-frequency but high-utility (interesting) patterns, but also remove some common-sense patterns (high-frequency but low-utility).

Here, we use a specific example to illustrate this problem. Figure 1 shows a sample spatial dataset consisting of four features, A, B, C, and D. A.2 represents the second instance of spatial feature A. Two instances are connected by edges (shown as solid lines) if they have a spatial neighbor relationship, and the whole space is divided into four equal regions. Suppose the utility values of these four regions are 8, 5, 2, and 1 respectively. According to the traditional co-location pattern mining method, the participation index of co-location {C, D} is 1/3. If the prevalence threshold is not less than 0.6, {C, D} would be regarded as a non-prevalent co-location. However, two row instances of {C, D}, {C.5, D.5} and {C.6, D.6}, all appear in Region 1 which shows the highest utility value. So the utility of each feature in {C, D} accounts for a large proportion of its total utility. {C, D} may be interesting to users. As to the pattern {B, C}, in contrast, the participation index of {B, C} is 2/3, the utility participation index of {B, C} is lower. So {B, C} may be non-interesting to users.

Therefore, the real interesting co-location patterns cannot be found by traditional approach because the importance of different regions is ignored. In the paper, we focus on high utility co-location mining based on importance of spatial regions.

1.1 Related Work

Morimoto et al. [4] first defined the problem of finding frequent neighboring co-locations in spatial databases. Huang et al. [5] proposed a general approach for mining co-location patterns, the join-based approach, which defined the participation index to measure the prevalence of a co-location. After that, different algorithms were proposed to improve the efficiency of the mining process, such as join-less algorithm [6] proposed by Yoo et al., and density based algorithm [2] proposed by Xiao et al. Wang et al. also presented a new join-less algorithm based on the CPI-tree [7] and an approach based on order-clique for discovering maximal co-locations [8]. Previous approaches focus on discovering global co-locations in a spatial dataset. The results of these approaches are usually not able to capture or represent the characteristics of different zones. Celik et al. [9] defined the problem of zonal co-location mining. He proposed an index structure based on the Quad-tree to support dynamic parameters for zonal co-location mining. Dai et al. [10] developed a new index structure and algorithm to mine zonal co-locations more efficiently. Sengstock et al. [11] introduced a new general class of interestingness measures that are based on the spatial distribution of co-location patterns. A new measurement using an evenness coefficient of the feature distribution was introduced, and a novel algorithm for discovering prevalent and evenly distributional co-location patterns was proposed in [12].

Much works on high utility pattern mining have presented different approaches in the transactional data sets [13]. A two-phase algorithm to efficiently prune down the number of candidates, through transaction-weighted downward closure property was presented in [14]. UP-Growth proposed in [15] enhances the mining performance in utility mining through maintaining the information of high utility itemsets by UP-tree. There are just a very limited number of studies on high utility co-location pattern mining. A framework for mining high utility co-locations was proposed in [16]. Wang et al. [17] discussed a problem of incremental mining high utility co-locations on spatial databases which are constantly changed with added and disappeared data. Recently Wang et al. [3] presented a method of mining high utility co-locations from spatial data sets with instance-specific utilities.

1.2 Contribution

The main contributions of this paper are as follows.

First, based on the importance of spatial regions, we propose a new measure called the utility participation index for high utility co-location pattern mining.

Second, we present a basic algorithm to discover high utility co-locations. In order to reduce the computational cost, an improved algorithm is given.

Finally, we evaluate our algorithms with experiments on both synthetic and real world data sets.

The remainder of this paper is organized as follows. Section 2 introduces basic concepts and definitions of high utility co-location pattern mining based on the importance of spatial regions. A basic algorithm and an improved algorithm with a pruning strategy are presented in Sect. 3. Experimental results are shown in Sect. 4. The last section is conclusions.

2 Basic Concepts and Problem Definition

2.1 Basic Concepts

A set of spatial features represents a collection of different kinds of items in space, as $ F = \left\{ {f_{1} ,f_{2} , \cdots ,f_{k} } \right\} $. An object with a specific spatial position is called a spatial instance, and a set of instances, as $ S = S_{1} \cup S_{1} \cup \ldots S_{k} $, in which $ S_{i} (1 \le i \le k) $ is an instance collection that corresponds to spatial feature. Each instance denoted by the feature type and a numeric id value e.g. B.1. If the Euclidean distance of two instances is not greater than the given threshold d, which means the two instances meet neighbor relationship R. A co-location pattern c is a set of spatial features, in which $ c \subseteq F $. The instance set I is a row instance of co-location pattern c, if (1) I contains all features of c, and no proper subset of I does so; (2) any two instances, i_i and i_j, $ i_{i} ,i_{j} \in I $, $ R(i_{i} ,i_{j} ) $, that is, i_i and i_j are neighbors. The set of all row instances of c is the table instance of c, denoted as T(c).

When the participation index of a co-location pattern is not less than the given threshold, it is referred as a prevalent co-location pattern. The participation index of co-location pattern c is expressed as PI (c), which is the minimum of participation ratios among all spatial features of c, defining as follows:

$$ PI(c) = \mathop {\hbox{min} }\limits_{{f_{i} \in c}} \{ Pr(c,f_{i} )\} $$

(1)

The participation ratio of spatial feature f_i in co-location pattern c is expressed as Pr (c, f_i), defined as:

$$ Pr(c,f_{i} ) = \frac{{{\text{Number of distinct instance of }}f_{i} {\text{ in any row instance of }}c}}{{{\text{Number of instance of }}f_{i} }} $$

(2)

2.2 Problem Definition

In practical applications, the effect of feature instances in different space regions is different. Therefore, we will set spatial region utility values based on the importance of each region, and then the region utilities will be transformed into the instances. In this section, some related definitions about mining high utility co-location patterns based on the region importance will be given.

Definition 1 (spatial region utility).

The spatial region utility is used to describe the importance of different spatial regions. We denote the spatial region utility of Region i as u _i.

Definition 2 (spatial instance utility).

Let spatial instance f_i.j be the j-th instance of feature f_i. We denote the utility of spatial instance f_i.j as u(f_i.j), which is the utility of region where f_i.j is located in it.

For example, in Fig. 1, A.2 is located in Region 2. u(A.2) = u₂ = 5.

Definition 3 (total utility of feature).

The total utility of a feature f_i is the sum of utilities of its instances, denoted as $ u(f_{i} ) = \sum\nolimits_{{f_{i} .j \in S_{i} }} {u(f_{i} .j)} $, where S_i is the a set of instances belonging to f_i.

For example, the total utility of feature A in Fig. 1 is u(A) = u(A.1) + u(A.2) + u(A.3) + u(A.4) + u(A.5) + u(A.6) = 1 + 5 + 2 × 3 + 8 = 20.

Definition 4 (utility of feature in co-location).

Given a size-k co-location pattern c = {f₁, f₂,…, f_k}, the utility of f_i in c is defined as the sum of utilities of instances belonging to feature f_i∈c in table instance T(c). It is denoted as $ u(c,f_{i} ) = \sum\nolimits_{{f_{i} .j \in \pi_{{f_{i} }} (T(c))}} {u(f_{i} .j)} $, where π is the relational projection operation with duplication elimination.

For example, in Fig. 1, considering the size-2 co-location pattern c = {C, D}, T(c) = {{C.5, D.5}, {C.6, D.6}}. The utility of D in c is u(c, D) = u(D.5) + u(D.6) = 8 + 8 = 16.

Definition 5 (Utility Participation Ratio, UPR).

The utility participation ratio of feature f_i in co-location pattern c is defined as the proportion of f_i’s utility in c to its total utility. The utility participation ratio can be computed as:

$$ UPR(c,f_{i} ) = \frac{{u(c,f_{i} )}}{{u(f_{i} )}} $$

(3)

For example, in Fig. 1, for pattern c = {C, D}, the utility participation ratio of each feature in c is computed as

$$ UPR(c,C) = \frac{u(C.5) + u(C.6)}{u(C)} = \frac{16}{26},UPR(c,D) = \frac{u(D.5) + u(D.6)}{u(D)} = \frac{16}{22}. $$

Definition 6 (Utility Participation Index, UPI).

The utility participation index UPI(c) of a co-location pattern c = {f₁, f₂,…, f_k} is the minimum in all UPR(c, f_i) of co-location c:

$$ UPI(c) = \mathop {\hbox{min} }\nolimits_{i = 1}^{k} \left\{ {UPR(c,f_{i} )} \right\} $$

(4)

Definition 7 (high utility co-location pattern, UCP).

Given a minimum UPI threshold θ, a co-location pattern c is a high utility co-location pattern if UPI(c) ≥ θ holds.

The prevalent patterns may not be high utility patterns and the high utility patterns may not be prevalent as well. For example, for patterns {B, C} and {C, D} in Fig. 1, PI({B, C}) = 0.67 and UPI({B, C}) = 0.38, while PI({C, D}) = 0.33 and UPI({C, D}) = 0.62. If both of minimum PI threshold and minimum UPI threshold are 0.6, we can identify {B, C} is prevalent easily, but {C, D} is not. At the same time, {C, D} is a high utility co-location pattern, but {B, C} is not.

Lemma 1.

The utility participation ratio and the utility participation index are antimonotone (monotonically non-increasing) as the size of the co-location increases.

Proof: The utility participation ratio is antimonotonic because a spatial feature instance that participates in a row instance of a co-location c also participates in a row instance of a co-location $ c' $, where $ c' \subseteq c $. Because the utility participation index is the minimum in all UPR(c, f_i) of co-location c, when the size of the co-location increases, the utility participation index is decreases, so it is also monotonically non-increasing.

Theorem 1.

If a size-k co-location pattern c_k= {f₁, f_2,… f_k} is the high utility co-location pattern, each size-(k-1) co-location pattern $ c_{k - 1} \subset c_{k} $ must be the high utility co-location.

Proof: According to lemma 1, it is easy to be proved.

3 UCP Mining

3.1 Basic Algorithm

The basic UCP mining algorithm uses the generate-and-test methods, that is, generate candidates, and test each candidate to identify whether it is a high utility co-location pattern. Algorithm 1 shows the pseudocode of the basic UPC mining.

Initialization (Step 1–3):

First of all, according to the importance of the regions, the data space is divided into m regions, and the utility value for these regions is set, respectively, u₁, u₂,…, u_m. And then size-1 high utility co-location patterns sets and size-1 table instance sets are initialized.

Generating Candidate Co-locations (Step 4–6):

Given a spatial data set and a distance threshold, all neighboring instance pairs can be found by using a geometric method such as plane sweep. Meanwhile, the set of size-2 candidates and the set of table instances of C₂ can be generated from the neighbor instance pairs. For k > 2, size-k candidate co-locations and table instances of them are generated from size-(k-1) high utility co-location patterns. Here, we use the antimonotone of utility participation index to prune the set of generated candidate patterns.

Discovering High Utility Co-locations (Step 7–8):

First, we calculate the UPI of each candidate co-location according to the Definition 6. Then, we identify high utility co-location patterns by the UPIs of candidate co-locations and the given UPI threshold θ.

Steps 4–9 are repeated with the increment of size k. Note that the row instances which lie across two regions are ignored in Algorithm 1.

3.2 Improved Algorithm

The basic UCP mining algorithm can efficiently discover utility co-location patterns due to the antimonotone property of UPI. In this section, we present an improved UCP mining algorithm for reducing the computational cost.

As shown in Fig. 1, the instances of feature B and C in pattern {B, C} exist in a high utility region, but there is no row instance of {B, C}. Therefore, the utility participation index of {B, C} is low. Through this observation, we get the improvement of basic algorithm: m regions are sorted from high to low according to the utility values, and the regional number is 1, 2,…, m in turn. And then we mine high utility co-location pattern with the strategy of generate-and-test based on the sorting regions.

Definition 7 (utility of feature within a region).

Within region l, suppose $ S_{i}^{l} $ is a set of instances belonging to a feature f_i. The utility of f_i within region l is denoted as $ u_{l} (f_{i} ) = \sum\nolimits_{{f_{i} .j \in S_{i}^{l} }} {u(f_{i} .j)} $.

For example, in Fig. 1, the utility of feature A within Region 1 is u₁(A) = u(A.6) = 8.

Definition 8 (utility of feature in co-location within a region).

Within region l, given a size-k co-location pattern c = {f₁, f₂,…, f_k}, the utility of f_i in c is defined as the sum of utilities of instances belonging to feature f_i∈ c in R^l(c), where R^l(c) is the set of row instance of c within region l. It is denoted as $ u_{l} (c,f_{i} ) = \sum\nolimits_{{f_{i} .j \in \pi_{{f_{i} }} (R^{l} (c))}} {u(f_{i} .j)} $, where π is the relational projection operation with duplication elimination.

For example, in Fig. 1, within Region 1, considering the size-2 co-location pattern c = {A, C}, R¹(c) = {A.6, C.6}. The utility of A in c within Region 1 is u₁(c, A) = u(A.6) = 8.

Lemma 2.

Let m regions are sorted from high to low according to the utility value, and the regional number is from 1, 2,…, m in turn. After the related utility values of the first l regions are calculated, the maximum utility participation ratio of feature f_i is UPR (c, f_{i) maxl}.

$$ UPR(c,f_{i} )_{\hbox{max} l} = \frac{{u(f_{i} ) - \sum\limits_{r = 1}^{l} {u_{r} (f_{i} )} + \sum\limits_{r = 1}^{l} {u_{r} (c,f_{i} )} }}{{u(f_{i} )}} $$

(5)

Proof:

$$ \begin{aligned} & UPR(c,f_{i} ) = \frac{{\sum\limits_{r = 1}^{l} {u_{r} (c,f_{i} )} + \sum\limits_{r = l + 1}^{m} {u_{r} (c,f_{i} )} }}{{u(f_{i} )}} \le \frac{{\sum\limits_{r = 1}^{l} {u_{r} (c,f_{i} )} + \sum\limits_{r = l + 1}^{m} {u_{r} (f_{i} )} }}{{u(f_{i} )}} \\ & UPR(c,f_{i} ) \le \frac{{\sum\limits_{r = 1}^{l} {u_{r} (c,f_{i} )} + u(f_{i} ) - \sum\limits_{r = 1}^{l} {u_{r} (f_{i} )} }}{{u(f_{i} )}} \\ \end{aligned} $$

Theorem 2.

Given a size-k co-location pattern c = {f₁, f₂, …, f_k}, if UPR (c, f_{i) maxl}< θ holds, then c must be a non-high utility co-location pattern, i.e., pattern c can be pruned.

Proof:

According to Definition 6 and Lemma 2, we have UPI (c) ≤ UPR (c, f_i) ≤ UPR (c, f_{i) maxl}. So, when UPR (c, f_{i) maxl}< θ holds, pattern c can be pruned.

For example, in Fig. 1, for pattern c = {B, C}, we have

$$ \begin{array}{*{20}c} {UPR(c,B)_{\hbox{max} 1} = \frac{{u(B) - u_{1} (B) + u_{1} (c,B)}}{u(B)} = \frac{13 - 0 + 0}{13} = 1} \\ {UPR(c,C)_{\hbox{max} 1} = \frac{{u(C) - u_{1} (C) + u_{1} (c,C)}}{u(C)} = \frac{26 - 16 + 0}{26} \approx 0.38} \\ \end{array} $$

If minimum UPI threshold θ is 0.6, pattern {B, C} can be pruned by Theorem 2.

According to Theorem 2, the basic mining algorithm can be optimized. Instead of scanning the entire dataset, candidate patterns and their table instances can be generated and UPI can be calculated. In particular, the initialization is similar to the basic algorithm, but the m regions should be sorted from high to low (step 2) according to their utility values. Then, we use size-(k-1) high utility co-location patterns to generate size-k candidate co-locations (step 5). And then, the table instance t_c is generated according to the regional order, one by one. Finally, the UPR (c, f_{i) maxl} is calculated. The unpruned candidate pattern C_k is the high utility pattern U_k (step 6–17). In the process of generating the table instance t_c of each pattern c, the time consumption for the improved algorithm will be reduced as a result of reducing the scanning scope of the dataset. Algorithm 2 shows the pseudocode of the improved UPC mining.

4 Experimental Evaluation

In this section, we perform a series of experiments to verify the effect and the efficiency of the basic UCP (BUCP) algorithm and the improved UCP (IUCP) algorithm on synthetic and real datasets.

The algorithms are implemented in JAVA and are memory-based algorithms. All the experiments were performed on an Intel core i7-6700 3.4 GHz PC with 8G MB main memory, running on Microsoft Windows10.

4.1 Datasets

For the synthetic dataset, we generate instances and randomly distribute them into a 1000 × 1000 space, and the utilities of regions are assigned randomly between 1 and 10. For the real dataset, we use a plant dataset of the “Three Parallel Rivers of Yunnan Protected Areas”, we set the utilities of regions by an expert judge method.

4.2 Quality of Mining Results

The criterion $ Q(c) = {{\sum\nolimits_{f \in c} {u(c,f)} } \mathord{\left/ {\vphantom {{\sum\nolimits_{f \in c} {u(c,f)} } {\sum\nolimits_{f \in c} {u(f)} }}} \right. \kern-0pt} {\sum\nolimits_{f \in c} {u(f)} }} $ is used to evaluate the quality of a mined co-location pattern c, and the bigger Q(c) is, c has the higher utilities [3]. We compare the quality of mining results identified by the traditional participation index (PI) and the UPI proposed in our paper.

Mining results on the synthetic dataset. The number of spatial features we take is 20; the total number of instances n is 15 K; the distance threshold d is 25; and the number of regions m is 9. Figure 2(a) shows the sum of quality of top-k interesting co-location patterns identified by the measure PI and UPI respectively. The results show that UPI measure can discover higher quality co-location patterns.

Mining results on the real dataset. In Fig. 2(b), we use a real-world plant distribution dataset which the number of plant species (spatial features) is 25 and the number of instances is 13348 in a 90 km × 90 km area. According to the soil properties, this space is divided into 7 regions. With the same results of the synthetic dataset, the quality of mining results by UPI measure is still better than the quality of mining results by PI measure.

4.3 UCP Mining Performance Evaluation

In this subsection, we compare running time of BUCP algorithm and IUCP algorithm on synthetic datasets.

Effect of the number of instances n. The number of spatial features we set is 20; the distance threshold d is 10; the number of regions m is 25; and the utility participation index threshold θ is 0.3, the running time of two algorithms by increasing the number of instances n is shown in Fig. 3(a). We can see that with the increase of n, the running time of the two algorithms is increased. This is because that the increase of n will produce more clique instances, leading to more time consumption. When n > 70 K, the algorithm IUCP is obviously superior to that of BUCP, because the IUCP algorithm can prune non-high utility candidate patterns through scanning parts of data space.

Effect of the distance threshold d. In Fig. 3(b), the number of spatial features is 20, n = 20 K, m = 25, θ = 0.3. Similarly, the running time is increased with the increase of d, because a larger value of d means more instances which could form cliques to bring more join operations. The performance of IUCP algorithm is still better than BUCP algorithm, as there is a pruning strategy in IUCP algorithm.

Effect of the utility participation index threshold θ. In Fig. 3(c), the number of spatial features is 20, n = 60 K, m = 25, d = 10. As θ becomes higher, more patterns dissatisfy the condition of the high utility patterns, naturally, the performance of two algorithms meliorate. When θ < 0.3, the running time of IUCP algorithm is less than BUCP. When θ > 0.3, the running time of the two algorithms is almost equal. This is because that when θ value is larger, BUCP algorithm has greatly reduced the running time by using antimonotone property of UPI.

Effect of number of regions m. In Fig. 3(d), the number of spatial features is 20, n = 60 K, d = 10, θ = 0.3. When m > 25, the running time of IUCP algorithm is less than that of BUCP algorithm. Besides, with the increase of m, the pruning utility of IUCP algorithm is better and its running time is less, while the UPI calculation amount of the algorithm BUCP is increased and its running time is slightly increased. When m < 25, with the increase of m, the running time of the two algorithms is reduced. However, the running time of IUCP algorithm is more than BUCP, because the pruning strategy of IUCP algorithm is invalid and it scans the entire datasets when identifying high utility patterns. In this way, the table instance is generated in each region and UPI is calculated, thus, increasing the running time.

5 Conclusions

In this paper, we present a method to find high utility co-location patterns based on importance of spatial regions. We use the utility participation index of the patterns as interestingness measure and present BUCP algorithm and IUCP algorithm with a pruning strategy. The experimental results show that we can effectively mine the high utility co-location patterns. IUCP algorithm achieves superior performance than to BUCP algorithm. In the future, we plan to validate our method with different types of datasets.

References

Shekhar, S., Huang, Y.: Discovering spatial co-location patterns: a summary of results. In: Proceedings of 7th International Symposium on Advances in Spatial and Temporal Databases (SSTD), pp. 236–256 (2001)
Chapter Google Scholar
Xiao, X., Xie, X., Luo, Q., Ma, W.: Density based co-location pattern discovery. In: Proceedings of 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 1–10 (2008)
Google Scholar
Wang, L., Jiang, W., Chen, H., Fang, Y.: Efficiently mining high utility co-location patterns from spatial data sets with instance-specific utilities. In: Proceedings of 22nd International Conference on Database Systems for Advanced Applications, pp. 458–474 (2017)
Chapter Google Scholar
Morimoto, Y.: Mining frequent neighboring class sets in spatial databases. In: Proceedings of 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 353–358 (2001)
Google Scholar
Huang, Y., Shekhar, S., Xiong, H.: Discovering colocation patterns from spatial data sets: a general approach. IEEE Trans. Knowl. Data Eng. 16(12), 1472–1485 (2004)
Article Google Scholar
Yoo, J.S., Shekhar, S.: A joinless approach for mining spatial colocation patterns. IEEE Trans. Knowl. Data Eng. 18(10), 1323–1337 (2006)
Article Google Scholar
Wang, L., Bao, Y., Lu, J., Yip, J.: A new join-less approach for co-location pattern mining. In: Proceedings of 8th IEEE International Conference on Computer and Information Technology (CIT2008), pp. 197–202 (2008)
Google Scholar
Wang, L., Zhou, L., Lu, J., Yip, J.: An order-clique-based approach for mining maximal co-locations. Inf. Sci. 179(19), 3370–3382 (2009)
Article Google Scholar
Celik, M., Kang, J.M., Shekhar, S.: Zonal co-location pattern discovery with dynamic parameters. In: Proceedings of 7th IEEE International Conference on Data Mining, pp. 433–438 (2007)
Google Scholar
Dai, B.R., Lin, M.Y.: Efficiently mining dynamic zonal co-location patterns based on maximal co-locations. In: Proceedings of 11th IEEE International Conference on Data Mining Workshops, pp. 861–868 (2011)
Google Scholar
Sengstock, C., Gertz, M., Canh, T.V.: Spatial interestingness measures for co-location pattern mining. In: Proceedings of 12th IEEE International Conference on Data Mining Workshops, pp. 821–826 (2012)
Google Scholar
Zhao, J., Wang, L., Bao, X., Tan, Y.: Mining co-location patterns with spatial distribution characteristics. In: International Conference on Computer, Information and Telecommunication Systems, pp. 26–30 (2016)
Google Scholar
Yao, H., Hamilton, H.J., Butz, C.J.: A foundational approach to mining itemset utilities from databases. In: Proceedings of 4th SIAM International Conference on Data Mining, pp. 482–486 (2004)
Chapter Google Scholar
Liu, Y., Liao, W., Choudhary, A.: A two-phase algorithm for fast discovery of high utility itemsets. In: Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, pp. 689–695 (2005)
Chapter Google Scholar
Tseng, V.S., Wu, C.W., Shie, B.E., Yu, P.S.: UP-Growth: an efficient algorithm for high utility itemset mining. In: Proceedings of 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 253–262 (2010)
Google Scholar
Yang, S., Wang, L., Bao, X., Lu, J.: A framework for mining spatial high utility co-location patterns. In: Proceedings 12th International Conference on Fuzzy Systems and Knowledge Discovery, pp. 595–601 (2015)
Google Scholar
Wang, X., Wang, L.: Incremental mining of high utility co-locations from spatial database. In: IEEE International Conference on Big Data and Smart Computing, pp. 215–222 (2017)
Google Scholar

Download references

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61472346, 61662086, 61762090), the Natural Science Foundation of Yunnan Province (2015FB114,2016FA026), the Project of Innovative Research Team of Yunnan Province (XT412011), and the Spectrum Sensing and Borderlands Security Key Laboratory of Universities in Yunnan (C6165903).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Yunnan University, Kunming, China
Jiasong Zhao, Lizhen Wang, Peizhong Yang & Hongmei Chen
Department of Electronic and Information Engineering, Yunnan Agricultural University, Kunming, China
Jiasong Zhao

Authors

Jiasong Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Lizhen Wang
View author publications
You can also search for this author in PubMed Google Scholar
Peizhong Yang
View author publications
You can also search for this author in PubMed Google Scholar
Hongmei Chen
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Lizhen Wang .

Editor information

Editors and Affiliations

Beijing Institute of Technology, Beijing, China
Hanning Yuan
Beijing Institute of Technology, Beijing, China
Jing Geng
Beijing Institute of Technology, Beijing, China
Chuanlu Liu
Wuhan University, Wuhan, China
Fuling Bian
Beijing Institute of Technology, Beijing, China
Tisinee Surapunt

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhao, J., Wang, L., Yang, P., Chen, H. (2018). Mining High Utility Co-location Patterns Based on Importance of Spatial Region. In: Yuan, H., Geng, J., Liu, C., Bian, F., Surapunt, T. (eds) Geo-Spatial Knowledge and Intelligence. GSKI 2017. Communications in Computer and Information Science, vol 849. Springer, Singapore. https://doi.org/10.1007/978-981-13-0896-3_5

Download citation

DOI: https://doi.org/10.1007/978-981-13-0896-3_5
Published: 12 June 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0895-6
Online ISBN: 978-981-13-0896-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics