Abstract
Tree patterns are fundamental to querying tree-structured data like XML. Because of the heterogeneity of XML data, it is often more appropriate to permit approximate query matching and return ranked answers, in the spirit of Information Retrieval, than to return only exact answers. In this paper, we study the problem of approximate XML query matching, based on tree pattern relaxations, and devise efficient algorithms to evaluate relaxed tree patterns. We consider weighted tree patterns, where exact and relaxed weights, associated with nodes and edges of the tree pattern, are used to compute the scores of query answers. We are interested in the problem of finding answers whose scores are at least as large as a given threshold. We design data pruning algorithms where intermediate query results are filtered dynamically during the evaluation process. We develop an optimization that exploits scores of intermediate results to improve query evaluation efficiency. Finally, we show experimentally that our techniques outperform rewriting-based and post-pruning strategies.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
S. Al-Khalifa, H. V. Jagadish, N. Koudas, J. M. Patel, D. Srivastava, and Y. Wu. Structural joins: Efficient matching of XML query patterns. In Proceedings of ICDE, 2002.
S. Boag, D. Chamberlin, M. Fernandez, D. Florescu, J. Robie, J. Simeon, and M. Stefanescu. XQuery 1.0: An XML query language. http://www.w3.org/TR/xquery.
Y. Chiaramella, P. Mulhem, and F. Fourel. A model for multimedia information retrieval. Technical report, FERMI ESPRIT BRA 8134, University of Glasgow. www.dcs.gla.ac.uk/fermi/tech_reports/reports/fermi96-4.ps.gz.
C. Delobel and M. C. Rousset. A uniform approach for querying large treestructured data through a mediated schema. In Proceedings of International Workshop on Foundations of Models for Information Integration (FMII), 2001.
A. Deutch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu. A query language for XML. In Proceedings of WWW, 1999.
C. Faloutsos. Access methods for text. ACM Computing Surveys, 17(1), 1985.
N. Fuhr and K. Grossjohann. XIRQL: An extension of XQL for information retrieval. In Proceedings of SIGIR, 2001.
Y. Hayashi, J. Tomita, and G. Kikui. Searching text-rich XML documents with relevance ranking. In Proceedings of SIGIR Workshop on XML and Information Retrieval, 2000.
Y. Kanza and Y. Sagiv. Flexible queries over semistructured data. In Proceedings of PODS, 2001.
J. McHugh, S Abiteboul, R. Goldman, D. Quass, and J. Widom. Lore: A database management system for semistructured data. SIGMOD Record, 26(3):45–66, 1997.
S. Myaeng, D.-H. Jang, M.-S. Kim, and Z.-C. Zhoo. A flexible Model for retrieval of SGML documents. In Proceedings of SIGIR, 1998.
M. Persin. Document filtering for fast ranking. In Proceedings of SIGIR, 1994.
J. Robie, J. Lapp, and D. Schach. XML query language (XQL). Available from http://www.w3.org/TandS/QL/QL98/pp/xql.html.
G. Salton and M. J. McGill. Introduction to modern information retrieval. McGraw-Hill, New York, 1983.
T. Schlieder. Similarity search in XML data using cost-based query Transformations. In Proceedings of SIGMOD WebDB Workshop, 2001.
A. Theobald and G. Weikum. Adding relevance to XML. In Proceedings of SIGMOD WebDB Workshop, 2000.
H. Turtle and J. Flood. Query evaluation: Strategies and optimization. Information Processing & Management, Nov. 1995.
C. Zhang, J. Naughton, D. DeWitt, Q. Luo, and G. Lohman. On supporting containment queries in relational database management systems. In Proceedings of SIGMOD, 2001.
K. Zhang and D. Shasha. Tree pattern matching. Pattern Matching Algorithms, Apostolico and Galil (Eds.), Oxford University Press, 1997.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Amer-Yahia, S., Cho, S., Srivastava, D. (2002). Tree Pattern Relaxation. In: Jensen, C.S., et al. Advances in Database Technology — EDBT 2002. EDBT 2002. Lecture Notes in Computer Science, vol 2287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45876-X_32
Download citation
DOI: https://doi.org/10.1007/3-540-45876-X_32
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43324-8
Online ISBN: 978-3-540-45876-0
eBook Packages: Springer Book Archive