Abstract.
XML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has removed one of the major obstacles to integrating data from widely disparate sources: the heterogeneity of data formats. However, general-purpose integration of data across the wide area also requires a query processor that can query data sources on demand, receive streamed XML data from them, and combine and restructure the data into new XML output, while providing good performance for both batch-oriented and ad hoc, interactive queries. This is the goal of the Tukwila data integration system, the first system that focuses on network-bound, dynamic XML data sources. In contrast to previous approaches, which must read, parse, and often store entire XML objects before querying them, Tukwila can return query results even as the data is streaming into the system. Tukwila is built with a new system architecture that extends adaptive query processing and relational-engine techniques into the XML realm, as facilitated by a pair of operators that incrementally evaluate a query's input path expressions as data is read. In this paper, we describe the Tukwila architecture and its novel aspects, and we experimentally demonstrate that Tukwila provides better overall query performance and faster initial answers than existing systems, and has excellent scalability.
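The idea of evaluating a path expression incrementally, as data is read, can be illustrated with a minimal sketch. This is not Tukwila's operator implementation: it assumes Python's standard-library `xml.etree.ElementTree.XMLPullParser`, a simple absolute tag path, and a hypothetical helper name `stream_matches`. It shows only that a match can be emitted before the rest of the document has arrived.

```python
import xml.etree.ElementTree as ET


def stream_matches(chunks, path):
    """Yield the text of each element whose tag path equals `path`.

    `chunks` is any iterable of XML text fragments (for example, pieces
    read from a network socket). Hypothetical helper for illustration;
    it is not part of Tukwila.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    stack = []  # tags of the currently open elements, root first
    for chunk in chunks:
        parser.feed(chunk)
        # Handle every event the parser could produce from the data so far.
        for event, elem in parser.read_events():
            if event == "start":
                stack.append(elem.tag)
            else:  # "end": the element and its text are now complete
                if stack == path:
                    yield (elem.text or "").strip()
                stack.pop()
                elem.clear()  # drop content that is no longer needed


# Assumed sample input: one document delivered in three fragments.
chunks = [
    "<db><book><title>A</title></book>",
    "<book><title>B</title></book>",
    "</db>",
]

for title in stream_matches(chunks, ["db", "book", "title"]):
    print(title)
```

Because `stream_matches` is a generator, the first title is produced after only the first fragment has been consumed, which mirrors the abstract's point about returning results while data is still streaming in. A real engine would additionally need richer path expressions, bindings for multiple variables, and joins across sources.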
Received: December 15, 2001 / Accepted: July 1, 2002 / Published online: December 13, 2002
Supported in part by an IBM Research Fellowship.
Ives, Z., Halevy, A. & Weld, D. An XML query engine for network-bound data. The VLDB Journal 11, 380–402 (2002). https://doi.org/10.1007/s00778-002-0078-5
DOI: https://doi.org/10.1007/s00778-002-0078-5