Abstract
Queries defined on data warehouses are complex and involve several join operations, which makes them computationally expensive. This cost becomes even more prohibitive when queries access very large volumes of data. To improve response time, data warehouse administrators generally use indexing techniques such as star join indexes or bitmap join indexes. Selecting these indexes by hand is nevertheless complex and tedious. Our solution lies in the field of data warehouse auto-administration. In this framework, we propose an automatic index selection strategy. We exploit a data mining technique, namely frequent itemset mining, to determine a set of candidate indexes from a given workload. We then propose several cost models that are used to build an index configuration composed of the indexes providing the best profit. These models evaluate the cost of accessing data using bitmap join indexes, as well as the cost of updating and storing these indexes.
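The candidate-generation step can be illustrated with a minimal sketch, assuming each workload query is reduced to the set of attributes it references. The code below is a plain Apriori-style miner written for illustration only; it is not necessarily the algorithm used in the paper, and the attribute names, the `workload` list, and the `min_support` value are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    transactions: iterable of attribute sets, one per workload query (assumed input).
    min_support: fraction of queries that must contain the itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every attribute that appears in the workload.
    current = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while current:
        # Count how many queries contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(frequent)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        joined = {a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return result


# Hypothetical workload: attributes referenced by four queries.
workload = [
    {"customer.city", "time.year"},
    {"customer.city", "time.year", "product.category"},
    {"customer.city", "product.category"},
    {"time.year"},
]

# Each frequent attribute set is a candidate for a bitmap join index.
candidates = frequent_itemsets(workload, min_support=0.5)
for itemset, support in sorted(candidates.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

In this sketch, each frequent attribute set would be treated as one candidate index; choosing among candidates is left to the cost models, which are not reproduced here.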
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Aouiche, K., Darmont, J., Boussaïd, O., Bentayeb, F. (2005). Automatic Selection of Bitmap Join Indexes in Data Warehouses. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2005. Lecture Notes in Computer Science, vol 3589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546849_7
DOI: https://doi.org/10.1007/11546849_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28558-8
Online ISBN: 978-3-540-31732-6
eBook Packages: Computer Science (R0)