Abstract
In content-based information retrieval (CBIR) of multimedia data, indexing and querying are challenging because the data are inherently high-dimensional. Metric-distance-based indexing, a data-based method, has recently emerged as an attractive approach because it exploits the properties of metric spaces to improve the efficiency and effectiveness of data indexing. The M-tree is one of the most efficient index structures for searching data in metric spaces: it is a paged, balanced, and dynamic tree that organizes data objects in an arbitrary metric space using nodes of fixed size. However, the M-tree and its variants have inherent disadvantages that limit further improvement of their indexing and query efficiency. To avoid these disadvantages, this paper proposes the sorted clue tree (SC-tree), which fundamentally modifies the nodes, entries, indexing algorithm, and query algorithm of the M-tree while preserving its advantages. Experimental results and complexity analyses show that the SC-tree is much more efficient than the M-tree in both query time and indexing time, without sacrificing query accuracy.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, B., Gan, J.Q. (2006). SC-Tree: An Efficient Structure for High-Dimensional Data Indexing. In: Bell, D.A., Hong, J. (eds) Flexible and Efficient Information Handling. BNCOD 2006. Lecture Notes in Computer Science, vol 4042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11788911_14
DOI: https://doi.org/10.1007/11788911_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35969-2
Online ISBN: 978-3-540-35971-5
eBook Packages: Computer Science, Computer Science (R0)