Abstract
Summarization based on GenSpace graphs aggregates data into summaries in many ways and identifies summaries that are far from user expectations. Mining interesting summaries in a GenSpace graph involves propagating expectations and calculating interestingness measures over the graph. Both steps require traversing the GenSpace graph, but the number of nodes in the graph is exponential in the number of attributes. In this paper, we propose pruning methods for different steps of the mining process: pruning nodes in ExGen graphs before constructing the GenSpace graph, pruning nodes in the GenSpace graph before propagation, and pruning nodes in the GenSpace graph during propagation. These methods make the traversal more efficient by reducing both the number of nodes visited and the number of records scanned in those nodes. We also present experimental results on the Saskatchewan weather data set.
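To illustrate why the node count grows exponentially and why pruning before construction helps, the following minimal sketch assumes that a GenSpace graph can be modelled as the product of one generalization hierarchy per attribute. The attribute names, the level names, and the helper functions (`genspace_nodes`, `genspace_size`, `prune_levels`) are hypothetical and are not taken from the paper; the sketch counts nodes only and does not implement expectation propagation or any interestingness measure.

```python
from itertools import product
from math import prod

# Hypothetical generalization levels per attribute (illustrative only,
# not the hierarchies used in the paper's experiments).
exgen_levels = {
    "time": ["day", "week", "month", "season", "year", "any"],
    "station": ["station", "region", "any"],
    "temperature": ["value", "range", "any"],
}


def genspace_nodes(levels):
    """Enumerate product nodes: each node picks one level per attribute."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


def genspace_size(levels):
    """Node count of the product: the per-attribute counts multiplied."""
    return prod(len(level_list) for level_list in levels.values())


def prune_levels(levels, uninteresting):
    """Drop levels marked as uninteresting before the product is built."""
    return {
        attr: [lvl for lvl in level_list
               if lvl not in uninteresting.get(attr, set())]
        for attr, level_list in levels.items()
    }


full_size = genspace_size(exgen_levels)
pruned = prune_levels(exgen_levels, {"time": {"week", "season"}})
pruned_size = genspace_size(pruned)

print(full_size, pruned_size)
```

Under this assumed model, removing two of the six `time` levels before construction removes every product node that contains them, so the saving is multiplied by the sizes of the other attributes' hierarchies.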
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Geng, L., Hamilton, H.J. (2004). Finding Interesting Summaries in GenSpace Graphs Efficiently. In: Tawfik, A.Y., Goodwin, S.D. (eds) Advances in Artificial Intelligence. Canadian AI 2004. Lecture Notes in Computer Science, vol 3060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24840-8_7
DOI: https://doi.org/10.1007/978-3-540-24840-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22004-6
Online ISBN: 978-3-540-24840-8
eBook Packages: Springer Book Archive