Abstract.
In data applications such as information integration, there can be limited access patterns to relations, i.e., binding patterns require values to be specified for certain attributes in order to retrieve data from a relation. As a consequence, we cannot retrieve all tuples from these relations. In this article we study the problem of computing the complete answer to a query, i.e., the answer that could be computed if all the tuples could be retrieved. A query is stable if for any instance of the relations in the query, its complete answer can be computed using the access patterns permitted by the relations. We study the problem of testing stability of various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons. We give algorithms and complexity results for these classes of queries. We show that stability of datalog programs is undecidable, and give a sufficient condition for stability of datalog queries. Finally, we study data-dependent computability of the complete answer to a nonstable query, and propose a decision tree for guiding the process to compute the complete answer.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Aho AV, Sagiv Y, Ullman JD (1979) Efficient optimization of a class of relational expressions. ACM Trans Database Syst 4(4):435-454
Aho AV, Sagiv Y, Ullman JD (1979) Equivalences among relational expressions. SIAM J Comput 8(2):218-246
Apers PMG, Hevner AR, Yao SB (1983) Optimization algorithms for distributed queries. IEEE Trans Software Eng 9(1):57-68
Beeri C, Ramakrishnan R (1987) On the power of magic. ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Databases, pp 269-283
Chandra AK, Merlin PM (1977) Optimal implementation of conjunctive queries in relational data bases. ACM Symposium on Theory of Computing, pp 77-90
Chawathe SS,(1994) The TSIMMIS project: integration of heterogeneous information sources. Information Processing Society of Japan, pp 7-18
Cinemachine. http://www.cinemachine.com/
Cosmadakis SS, Gaifman H, Kanellakis PC, Vardi MY (1988) Decidable optimization problems for database logic programs. ACM Symposium on Theory of Computing, pp 477-490
Duschka OM (1997) Query planning and optimization in information integration. Ph.D. Thesis, Computer Science Dept., Stanford University
Duschka OM, Levy A (1997) Recursive plans for information gathering. In: International Joint Conference on Artificial Intelligence
Florescu D, Koller D, Levy AY (1997) Using probabilistic information in data integration. In: Proc. Very Large Data Bases Conference, pp 216-225
Florescu D, Levy A, Manolescu I, Suciu D (1999) Query optimization in the presence of limited access patterns. In: ACM SIGMOD International Conference on Management of Data, pp 311-322
Gaifman H, Mairson HG, Sagiv Y, Vardi MY (1993) Undecidable optimization problems for database logic programs. J ACM 40(3):683-713
Grädel E, Gurevich Y, Hirch C (1998) The complexity of query reliability. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 227-234
Gupta A, Sagiv Y, Ullman JD, Widom J (1994) Constraint checking with partial information. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 45-55
Haas LM, Kossmann D, Wimmers EL, Yang J (1997) Optimizing queries across diverse data sources. In: Proc. of Very Large Data Bases Conference, pp 276-285
IMDB. The Internet Movie Database Ltd. Search Engine. http://www.imdb.com/search/
Ioannidis Y (1985) A time bound on the materialization of some recursively defined views. In: Pirotte A, Vassiliou Y (eds) Proc. Very Large Data Bases Conference. Morgan Kaufmann, San Francisco, pp 219-226
Ives Z, Florescu D, Friedman M, Levy A, Weld D (1999) An adaptive query execution engine for data integration. In: ACM SIGMOD International Conference on Management of Data, pp 299-310
Johnson DS, Klug AC (1983) Optimizing conjunctive queries that contain untyped variables. SIAM J Comput 12(4):616-640
Kifer M, Li A (1988) On the semantics of rule-based expert systems with uncertainty. In: Gyssens M, Paredaens J, Gucht DV (eds) International Conference on Database Theory, Lecture Notes in Computer Science, vol. 326. Springer, Berlin Heidelberg New York, pp 102-117
Levy a (1996) Obtaining complete answers from incomplete databases. In: Proc. Very Large Data Bases Conference, pp 402-412
Levy A, Mendelzon AO, Sagiv Y, Srivastava D (1995) Answering queries using views. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 95-104
Levy A, Rajaraman A, Ordille JJ (1996) Querying heterogeneous information sources using source descriptions. In: Proc. Very Large Data Bases Conference, pp 251-262
Li C (1999) Computing complete answers to queries in the presence of limited access patterns (extended version). Technical report, Computer Science Dept., Stanford University. http://dbpubs.stanford.edu:8090/pub/1999-11
Li C, Chang E (2000) Query planning with limited source capabilities. In: International Conference on Data Engineering, pp 401-412
Li C, Chang E (2001) Answering queries with useful bindings. ACM Trans Database Syst 26(3):313-343
Li C, Chang E (2001) On answering queries in the presence of limited access patterns. In: International Conference on Database Theory, pp 99-113
Li C, Yerneni R, Vassalos V, Garcia-Molina H, Papakonstantinou Y, Ullman JD, Valiveti M (1998) Capability based mediation in TSIMMIS. In: ACM SIGMOD International Conference on Management of Data, pp 564-566
Maher MJ (1993) A logic programming view of CLP. In: International Conference on Logic Programming, pp 737-753
Mecca G, Atzeni P, Masci A, Merialdo P, Sindoni G (1998) The Araneus web-base management system. In: ACM SIGMOD International Conference on Management of Data, pp 544-546
Mendelzon AO, Mihaila GA (2001) Querying partially sound and complete data sources. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database
Morris KA (1988) An algorithm for ordering subgoals in NAIL! In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 82-88
Naughton JF, Sagiv Y (1987) A decidable class of bounded recursions. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 227-236
Ono K, Lohman GM (1990) Measuring the complexity of join enumeration in query optimization. In: Proc. Very Large Data Bases Conference, pp 314-325, Morgan Kaufmann, San Francisco
Qian X (1996) Query folding. In: International Conference on Data Engineering, pp 48-55
Rajaraman A, Sagiv Y, Ullman JD (1995) Answering queries using templates with binding patterns. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 105-112
Sagiv Y (1985) On computing restricted projections of representative instances. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 171-180
Sagiv Y, Yannakakis M (1980) Equivalences among relational expressions with the union and difference operators. J ACM 27(4):633-655
Saraiya Y (1991) Subtree elimination algorithms in deductive databases. Ph.D. Thesis, Computer Science Dept., Stanford University
Shmueli O (1993) Equivalence of datalog queries is undecidable. J Logic Program 15(3):231-241
Swami AN, Gupta A (1988) Optimization of large join queries. In: ACM SIGMOD International Conference on Management of Data, pp 8-17
Ullman JD (1989) Principles of database and knowledge-base systems, vol. 2. The new technologies. Computer Science, New York
Ullman JD, Vardi MY (1988) The complexity of ordering subgoals. In: ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database, pp 74-81
Yan LL, Özsu MT, Liu L (1997) Accessing heterogeneous data through homogenization and integration mediators. In: International Conference on Cooperative Information Systems, pp 130-139
Yerneni R, Li C, Garcia-Molina H, Ullman JD (1999) Computing capabilities of mediators. In: ACM SIGMOD International Conference on Management of Data, pp 443-454
Yerneni R, Li C, Ullman JD, Garcia-Molina H (1999) Optimizing large join queries in mediation systems. In: International Conference on Database Theory, pp 348-364
Zhang X, Ozsoyoglu M (1993) On efficient reasoning with implication constraints. In: Deductive and Object-Oriented Databases, pp 236-252
Author information
Authors and Affiliations
Corresponding author
Additional information
Received: 6 December 2001, Accepted: 25 November 2002, Published online: 3 April 2003
Chen Li: This article combines and integrates some content in the technical report at Stanford University [25] and the paper presented in the 8th International Conference on Database Theory (ICDT), London, UK, January, 2001 [28]. In addition to the prior materials, this article contains more results and complete proofs that were not included in the original reports.
Rights and permissions
About this article
Cite this article
Li, C. Computing complete answers to queries in the presence of limited access patterns. VLDB 12, 211–227 (2003). https://doi.org/10.1007/s00778-002-0085-6
Issue Date:
DOI: https://doi.org/10.1007/s00778-002-0085-6