Abstract
We present an algorithm for mining frequent queries in arbitrary relational databases over which functional dependencies are assumed. Building upon previous results, we restrict our attention to the appealing subclass of simple conjunctive queries. The proposed algorithm uses the functional dependencies of the database to optimise the generation of queries and to prune redundant queries. Furthermore, our algorithm is capable of detecting previously unknown functional dependencies that hold on the database relations as well as on joins of relations. These detected dependencies are in turn used to prune further redundant queries. We propose an efficient database-oriented implementation of our algorithm using SQL, and provide several promising experimental results.
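As an illustration of how a functional dependency might be detected with plain SQL, the following is a minimal sketch and not the paper's implementation. It assumes that a dependency X -> Y holds on the current instance of a relation exactly when projecting onto X and onto X plus Y yields the same number of distinct tuples. The table `movie`, its columns, and the helper names `distinct_count` and `fd_holds` are hypothetical and chosen only for this example.

```python
import sqlite3


def distinct_count(conn, table, attrs):
    """Count the distinct tuples in the projection of `table` onto `attrs`."""
    cols = ", ".join(attrs)
    # Identifiers are interpolated directly; this is acceptable only because
    # the names in this sketch are fixed and trusted.
    sql = f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table})"
    return conn.execute(sql).fetchone()[0]


def fd_holds(conn, table, lhs, rhs):
    """Return True if lhs -> rhs holds on the current instance of `table`.

    Assumption of this sketch: the dependency holds exactly when adding the
    rhs attributes to the projection does not create new distinct tuples.
    """
    return distinct_count(conn, table, lhs) == distinct_count(conn, table, lhs + rhs)


# Hypothetical toy relation used only to exercise the two helpers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (title TEXT, director TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [
        ("Alien", "Scott", "horror"),
        ("Blade Runner", "Scott", "sci-fi"),
        ("Heat", "Mann", "crime"),
        ("Ali", "Mann", "drama"),
    ],
)

print(fd_holds(conn, "movie", ["title"], ["director"]))  # title -> director
print(fd_holds(conn, "movie", ["director"], ["genre"]))  # director -> genre
```

Under the same assumption, when X -> Y holds the two projections have equal distinct counts, so a miner that measures support by distinct tuples could treat one of the two projection queries as redundant. Whether the paper prunes in exactly this way is not stated in the abstract.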
Keywords
- Association Rule
- Functional Dependency
- Mining Association Rule
- Conjunctive Query
- Minimum Support Threshold
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Goethals, B., Laurent, D., Le Page, W. (2010). Discovery and Application of Functional Dependencies in Conjunctive Query Mining. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2010. Lecture Notes in Computer Science, vol 6263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15105-7_12
DOI: https://doi.org/10.1007/978-3-642-15105-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15104-0
Online ISBN: 978-3-642-15105-7
eBook Packages: Computer Science (R0)