Abstract
Pushing data-intensive analytics down to database engines is the key to high-performance and secure execution; however, the existing SQL framework can neither express general graph-based dataflow processes nor orchestrate multiple dataflow processes with inter-operation data dependencies.
In this work we extend SQL to Functional Form-SQL (FF-SQL), based on a calculus of queries, to express complex dataflow graphs declaratively. An FF-SQL query is constructed from conventional queries using Function Forms (FFs). While a conventional SQL query represents a dataflow tree, an FF-SQL query represents a more general dataflow graph. Further, with FF-SQL, a group of SQL dataflow processes with data dependencies among their operations can be specified as a single, integrated FF-SQL definition and executed cooperatively inside the database engine, without repeated data retrieval, duplicated computation, or unnecessary data copying. We extend the PostgreSQL query engine to support FF-SQL dataflow processes.
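The distinction between a dataflow tree and a dataflow graph can be illustrated with a small sketch. The Python below is not FF-SQL syntax and does not reflect the PostgreSQL extension; the combinators `compose` and `fork`, the helper `merge_single`, and the sample data are hypothetical names chosen for illustration, loosely in the spirit of functional forms that build new functions from existing ones. Each "query" is modeled as a function from rows to rows, and the upstream query `filter_big` is evaluated once while two downstream queries consume its result.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Query = Callable[[List[Row]], List[Row]]

# Counter used only to show how often the shared upstream query runs.
calls = {"filter_big": 0}

def filter_big(rows: List[Row]) -> List[Row]:
    """Upstream query: keep rows whose amount is at least 10."""
    calls["filter_big"] += 1
    return [r for r in rows if r["amount"] >= 10]

def total(rows: List[Row]) -> List[Row]:
    """Downstream query 1: sum of amounts, returned as a one-row result."""
    return [{"total": sum(r["amount"] for r in rows)}]

def count(rows: List[Row]) -> List[Row]:
    """Downstream query 2: number of rows, returned as a one-row result."""
    return [{"count": len(rows)}]

def compose(outer: Query, inner: Query) -> Query:
    """Hypothetical form: feed the result of `inner` into `outer`."""
    return lambda rows: outer(inner(rows))

def fork(left: Query, right: Query,
         merge: Callable[[List[Row], List[Row]], List[Row]]) -> Query:
    """Hypothetical form: apply two queries to the same input, then merge."""
    def run(rows: List[Row]) -> List[Row]:
        return merge(left(rows), right(rows))
    return run

def merge_single(a: List[Row], b: List[Row]) -> List[Row]:
    """Combine two one-row results into a single row."""
    return [{**a[0], **b[0]}]

# Graph shape: filter_big -> (total, count) -> merge_single.
summary = compose(fork(total, count, merge_single), filter_big)

orders = [{"amount": 5}, {"amount": 10}, {"amount": 25}]
result = summary(orders)
print(result, calls)
```

Because `fork` hands the same intermediate rows to both downstream queries, the filter runs once rather than once per consumer. Expressed as two independent tree-shaped queries, the same filter would be evaluated twice.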
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, Q., Hsu, M. (2009). Cooperating SQL Dataflow Processes for In-DB Analytics. In: Meersman, R., Dillon, T., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2009. OTM 2009. Lecture Notes in Computer Science, vol 5870. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05148-7_28
DOI: https://doi.org/10.1007/978-3-642-05148-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05147-0
Online ISBN: 978-3-642-05148-7
eBook Packages: Computer Science, Computer Science (R0)