Abstract
The traditional task of association rule mining is to find all rules with both high support and high confidence. In some applications, such as mining spatial datasets for natural resource location, the task is to find high confidence rules even when their support is low. In others, such as the identification of agricultural pest infestations, the task is to find high confidence rules, preferably while their support is still very low. The basic Apriori algorithm cannot solve these problems efficiently because it relies on first identifying all high support itemsets. In this paper, we propose a new model for deriving high confidence rules from spatial data regardless of their support level. The model uses a new data structure, the Peano Count Tree (P-tree), to represent all the information it needs. P-trees represent spatial data bit by bit in a recursive quadrant-by-quadrant arrangement. From the P-trees we build a special data cube, the Tuple Count Cube (T-cube), from which high confidence rules are derived. Our algorithm for deriving confident rules is fast and efficient. In addition, we discuss strategies for avoiding over-fitting by removing redundant and misleading rules.
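To make the two structures named in the abstract more concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify node layout, quadrant ordering, or how the T-cube is stored, so the function names (`build_ptree`, `tuple_count_cube`, `confidence`), the upper-left, upper-right, lower-left, lower-right quadrant order, and the use of a `Counter` keyed by value tuples are all hypothetical choices. The sketch also counts tuples directly from pixel values rather than deriving the counts from P-trees as the paper describes.

```python
from collections import Counter


def build_ptree(bits, row=0, col=0, size=None):
    """Build a quadrant count tree over a square 0/1 grid (side a power of 2).

    Each node stores the number of 1-bits in its quadrant. Quadrants that
    are all 0 or all 1 are not subdivided. Node layout is an assumption.
    """
    if size is None:
        size = len(bits)
    count = sum(bits[r][c]
                for r in range(row, row + size)
                for c in range(col, col + size))
    node = {"count": count, "size": size, "children": []}
    # Pure quadrants (all 0s or all 1s) and single cells are leaves.
    if count == 0 or count == size * size or size == 1:
        return node
    half = size // 2
    # Assumed order: upper-left, upper-right, lower-left, lower-right.
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        node["children"].append(build_ptree(bits, row + dr, col + dc, half))
    return node


def tuple_count_cube(pixels):
    """Count how often each combination of attribute values occurs."""
    return Counter(pixels)


def confidence(cube, antecedent, consequent):
    """Confidence of antecedent => consequent, computed from tuple counts.

    antecedent and consequent map an attribute index to a required value.
    No minimum support is applied. Returns None if nothing matches the
    antecedent.
    """
    def matches(tup, conditions):
        return all(tup[i] == v for i, v in conditions.items())

    n_antecedent = sum(n for t, n in cube.items() if matches(t, antecedent))
    if n_antecedent == 0:
        return None
    n_both = sum(n for t, n in cube.items()
                 if matches(t, antecedent) and matches(t, consequent))
    return n_both / n_antecedent


if __name__ == "__main__":
    grid = [[1, 1, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 1, 1],
            [0, 0, 1, 1]]
    tree = build_ptree(grid)
    print(tree["count"], [child["count"] for child in tree["children"]])

    # Each pixel is a tuple of (band 0 value, band 1 value); made-up data.
    cube = tuple_count_cube([(0, 1), (0, 1), (0, 0), (1, 1)])
    print(confidence(cube, {0: 1}, {1: 1}))
```

In the made-up pixel data, the rule "band 0 = 1 implies band 1 = 1" is matched by only one of the four pixels, yet it holds for every pixel that satisfies its antecedent. A support-first search such as Apriori would prune a rule like this before its confidence was ever examined.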
This work was partially supported by a U. S. - G. S. A. VAST grant. Patents are pending on the P-Tree Data Mining Technology.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Perrizo, W., Ding, Q., Ding, Q., Roy, A. (2001). Deriving High Confidence Rules from Spatial Data Using Peano Count Trees. In: Wang, X.S., Yu, G., Lu, H. (eds) Advances in Web-Age Information Management. WAIM 2001. Lecture Notes in Computer Science, vol 2118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47714-4_9
DOI: https://doi.org/10.1007/3-540-47714-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42298-3
Online ISBN: 978-3-540-47714-3
eBook Packages: Springer Book Archive