Abstract
Extraction–Transform–Load (ETL) processes are complex data workflows responsible for maintaining a Data Warehouse. A plethora of ETL tools is currently available, constituting a multi-million dollar market. Each ETL tool uses its own technique for designing and implementing an ETL workflow, which makes assessing ETL tools extremely difficult. In this paper, we identify common characteristics of ETL workflows in an effort to propose a unified evaluation method for ETL. We also identify the main points of interest in designing, implementing, and maintaining ETL workflows. Finally, we propose a principled organization of test suites, based on the TPC-H schema, for experimenting with ETL workflows.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Simitsis, A., Vassiliadis, P., Dayal, U., Karagiannis, A., Tziovara, V. (2009). Benchmarking ETL Workflows. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2009. Lecture Notes in Computer Science, vol 5895. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10424-4_15
DOI: https://doi.org/10.1007/978-3-642-10424-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10423-7
Online ISBN: 978-3-642-10424-4
eBook Packages: Computer Science (R0)