Abstract
The TPC-D benchmark was developed almost 20 years ago, and even though its current incarnation, TPC-H, could be considered superseded by TPC-DS, one can still learn from it. We focus on the technical level, summarizing the challenges posed by the TPC-H workload as we now understand them; we call these challenges “choke points”. We identify 28 such choke points, grouped into six categories: Aggregation Performance, Join Performance, Data Access Locality, Expression Calculation, Correlated Subqueries, and Parallel Execution. On the meta-level, we make the point that the rich set of choke points found in TPC-H sets an example for how to design future DBMS benchmarks.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Boncz, P., Neumann, T., Erling, O. (2014). TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark. In: Nambiar, R., Poess, M. (eds) Performance Characterization and Benchmarking. TPCTC 2013. Lecture Notes in Computer Science, vol 8391. Springer, Cham. https://doi.org/10.1007/978-3-319-04936-6_5
DOI: https://doi.org/10.1007/978-3-319-04936-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04935-9
Online ISBN: 978-3-319-04936-6
eBook Packages: Computer Science, Computer Science (R0)