Abstract
The eXtensible Markup Language (XML) has rapidly emerged as a standard for representing and exchanging information. The fast-growing amount of available XML data creates a pressing need for languages and tools to manage collections of XML documents, as well as to mine interesting information out of them. Although the data mining community has not yet rushed into the use of XML, there have been some proposals to exploit it. In practice, however, these proposals mainly rely on more or less traditional relational databases with an XML interface. In this paper, we introduce association rules from native XML documents and discuss the new challenges and opportunities that this topic poses for the data mining community. More specifically, we introduce an extension of XQuery for mining association rules. This extension is used throughout the paper to better define association rule mining within XML and to emphasize its implications in the XML context.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Braga, D., Campi, A., Klemettinen, M., Lanzi, P. (2002). Mining Association Rules from XML Data. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_3
DOI: https://doi.org/10.1007/3-540-46145-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44123-6
Online ISBN: 978-3-540-46145-6
eBook Packages: Springer Book Archive