Abstract
This paper presents a novel region discovery framework geared towards finding scientifically interesting places in spatial datasets. We view region discovery as a clustering problem in which an externally given fitness function has to be maximized. The framework adapts four representative clustering algorithms, exemplifying prototype-based, grid-based, density-based, and agglomerative clustering, and we systematically evaluate the four algorithms in a real-world case study. The task is to find feature-based hotspots on Mars where extreme densities of deep ice and shallow ice co-locate. The results reveal that the density-based algorithm outperforms the other algorithms in that it discovers more regions with higher interestingness, that the grid-based algorithm provides acceptable solutions quickly, and that the agglomerative clustering algorithm performs best at identifying larger regions of arbitrary shape. Moreover, the results indicate that there are only a few regions on Mars where shallow and deep ground ice co-locate, suggesting that the two were deposited at different geological times.
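To make the idea of clustering against an externally given fitness function concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the form of the fitness function, so the additive form used here (interestingness of each cluster weighted by its size raised to `beta`), the `interestingness` measure, the `threshold` and `beta` parameters, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def interestingness(points, threshold=0.25):
    """Hypothetical measure of co-location inside one cluster.

    Each point is a pair of z-scored feature values (for example deep-ice
    and shallow-ice density). The mean product of the pair is rewarded only
    when it exceeds `threshold`; otherwise the cluster scores zero.
    """
    mean_product = sum(a * b for a, b in points) / len(points)
    return max(0.0, mean_product - threshold)


def fitness(labels, features, beta=1.5):
    """Assumed additive fitness: sum of interestingness(c) * |c|**beta.

    `labels[i]` is the cluster of point i; None marks an outlier that
    belongs to no region and contributes nothing.
    """
    clusters = defaultdict(list)
    for label, feat in zip(labels, features):
        if label is not None:
            clusters[label].append(feat)
    return sum(interestingness(pts) * len(pts) ** beta
               for pts in clusters.values())


# Toy data: two points with strongly co-located features, two without.
features = [(2.0, 2.0), (1.5, 2.5), (-0.1, 0.2), (0.3, -0.2)]

# A clustering that isolates the co-located points scores higher than
# one that mixes them with uninteresting points.
good = fitness([0, 0, 1, 1], features)
bad = fitness([0, 1, 0, 1], features)
print(good > bad)
```

Under this framing, any of the four kinds of clustering algorithm named in the abstract can be compared on the same footing: each proposes a set of regions, and the shared fitness function scores the proposal.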
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ding, W., Jiamthapthaksin, R., Parmar, R., Jiang, D., Stepinski, T.F., Eick, C.F. (2008). Towards Region Discovery in Spatial Datasets. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_10
DOI: https://doi.org/10.1007/978-3-540-68125-0_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science (R0)