Abstract
While cluster computing is well established, it is not clear how to coordinate clusters consisting of many database components in order to process high workloads. In this paper, we focus on Online Analytical Processing (OLAP) queries, i.e., relatively complex queries whose evaluation tends to be time-consuming, and we report on some observations and preliminary results of our PowerDB project in this context. We investigate how many cluster nodes should be used to evaluate an OLAP query in parallel. Moreover, we provide a classification of OLAP queries, which is used to decide whether and how a query should be parallelized. We run extensive experiments to evaluate these query classes in quantitative terms. Our results are an important step towards a two-phase query optimizer. In the first phase, the coordination infrastructure decomposes a query into subqueries and ships them to appropriate cluster nodes. In the second phase, each cluster node optimizes and evaluates its subquery locally.
Project partially supported by Microsoft Research.
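The two-phase scheme outlined in the abstract can be illustrated with a small sketch. It is not the PowerDB implementation. It assumes that every node holds a full copy of the data, that the query is a simple SUM aggregate, and that the query is split by key range. The table `orders` and the functions `make_node`, `decompose`, `evaluate_locally` and `parallel_sum` are hypothetical names chosen for illustration, and in-memory SQLite databases stand in for the cluster nodes.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_node(rows):
    """Create one stand-in cluster node: an in-memory SQLite database
    holding a full copy of the (hypothetical) orders table."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


def decompose(lo, hi, n_nodes):
    """Phase 1 (coordinator): split the key range [lo, hi) into
    contiguous subranges, at most one per node."""
    if hi <= lo:
        return []
    step = -(-(hi - lo) // n_nodes)  # ceiling division
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def evaluate_locally(conn, key_range):
    """Phase 2 (node): the node's own engine plans and runs its subquery."""
    start, end = key_range
    cur = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE id >= ? AND id < ?",
        (start, end),
    )
    return cur.fetchone()[0]


def parallel_sum(nodes, lo, hi):
    """Ship one subquery per node, then combine the partial sums."""
    ranges = decompose(lo, hi, len(nodes))
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(evaluate_locally, nodes, ranges))
    return sum(partials)


# Toy data: ids 1..100, amount = 10 * id, replicated on four nodes.
rows = [(i, i * 10) for i in range(1, 101)]
nodes = [make_node(rows) for _ in range(4)]
total = parallel_sum(nodes, 1, 101)
print(total)
```

SUM is used here because its partial results can be combined by simple addition. Other aggregates or query shapes may need a different combination step, or may not be worth parallelizing at all, which is the kind of decision the query classification in the abstract is meant to support.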
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Akal, F., Böhm, K., Schek, H.-J. (2002). OLAP Query Evaluation in a Database Cluster: A Performance Study on Intra-Query Parallelism. In: Manolopoulos, Y., Návrat, P. (eds) Advances in Databases and Information Systems. ADBIS 2002. Lecture Notes in Computer Science, vol 2435. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45710-0_18
DOI: https://doi.org/10.1007/3-540-45710-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44138-0
Online ISBN: 978-3-540-45710-7
eBook Packages: Springer Book Archive