An Indexing Structure for Dynamic Multidimensional Data in Vector Space

Mikhaylova, Elena; Novikov, Boris; Volokhov, Anton

doi:10.1007/978-3-642-32741-4_17

Elena Mikhaylova⁴,
Boris Novikov⁴ &
Anton Volokhov⁴

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 186))

867 Accesses

Abstract

The multidimensional k – NN (k nearest neighbors) query problem is relevant to a large variety of database applications, including information retrieval, natural language processing, and data mining. To solve it efficiently, the database needs an indexing structure that provides this kind of search. However, attempts to find an exact solution are hardly feasible in multidimensional space. In this paper, a novel indexing technique for the approximate solution of k – NN problem is described and analyzed. The construction of the indexing tree is based on clustering. Indexing structure is implemented on top of high-performance industrial DBMS.

This research is supported by Russian Foundation for Basic Research, grant 10-07-00156.

Access provided by Autonomous University of Puebla. Download to read the full chapter text

Chapter PDF

PL-Tree: An Efficient Indexing Method for High-Dimensional Data

The Design Perspective of the Structures Based on k-d Tree

High-dimensional similarity searches using query driven dynamic quantization and distributed indexing

Article 11 April 2019

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51(1), 117–122 (2008), http://doi.acm.org/10.1145/1327452.1327494 , doi:10.1145/1327452.1327494
Article Google Scholar
Barton, S., Gouet-Brunet, V., Rukoz, M.: Large scale disk-based metric indexing structure for approximate information retrieval by content. In: Proceedings of the 1st Workshop on New Trends in Similarity Search, NTSS 2011, pp. 2–7. ACM, New York (2011), http://doi.acm.org/10.1145/1966865.1966869 , doi:10.1145/1966865.1966869
Chapter Google Scholar
Chen, J., Fang, H.R., Saad, Y.: Fast approximate knn graph construction for high dimensional data via recursive lanczos bisection. J. Mach. Learn. Res. 10, 1989–2012 (2009), http://dl.acm.org/citation.cfm?id=1577069.1755852
MathSciNet MATH Google Scholar
Ciaccia, P., Patella, M.: Bulk loading the m-tree. In: Proceedings of the 9th Australasian Database Conference, ADC 1998, pp. 15–26 (1998)
Google Scholar
Ciaccia, P., Patella, M., Zezula, P.: M-tree: An efficient access method for similarity search in metric spaces. In: Proceedings of the 23rd International Conference on Very Large Data Bases, VLDB 1997, pp. 426–435. Morgan Kaufmann Publishers Inc., San Francisco (1997), http://dl.acm.org/citation.cfm?id=645923.671005
Google Scholar
Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the Twentieth Annual Symposium on Computational Geometry, SCG 2004, pp. 253–262. ACM, New York (2004), http://doi.acm.org/10.1145/997817.997857 , doi:10.1145/997817.997857
Chapter Google Scholar
Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on Very Large Data Bases, VLDB 1999, pp. 518–529. Morgan Kaufmann Publishers Inc., San Francisco (1999), http://dl.acm.org/citation.cfm?id=645925.671516
Google Scholar
Gudmundsson, G.T., Jónsson, B.T., Amsaleg, L.: A large-scale performance study of cluster-based high-dimensional indexing. In: Proceedings of the International Workshop on Very-Large-Scale Multimedia Corpus, Mining and Retrieval, VLS-MCMR 2010, pp. 31–36. ACM, New York (2010), http://doi.acm.org/10.1145/1878137.1878145 , doi:10.1145/1878137.1878145
Chapter Google Scholar
Günnemann, S., Kremer, H., Lenhard, D., Seidl, T.: Subspace clustering for indexing high dimensional data: a main memory index based on local reductions and individual multi-representations. In: Proceedings of the 14th International Conference on Extending Database Technology, EDBT/ICDT 2011, pp. 237–248. ACM, New York (2011), http://doi.acm.org/10.1145/1951365.1951395 , doi:10.1145/1951365.1951395
Chapter Google Scholar
Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, SIGMOD 1984, pp. 47–57. ACM, New York (1984), http://doi.acm.org/10.1145/602259.602266 , doi:10.1145/602259.602266
Chapter Google Scholar
Guttman, A.: R-trees: a dynamic index structure for spatial searching. SIGMOD Rec. 14(2), 47–57 (1984), http://doi.acm.org/10.1145/971697.602266 , doi:10.1145/971697.602266
Article Google Scholar
Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC 1998, pp. 604–613. ACM, New York (1998), http://doi.acm.org/10.1145/276698.276876 , doi:10.1145/276698.276876
Chapter Google Scholar
Moreno-Seco, F., Milcó, L., Oncina, J.: A modification of the laesa algorithm for approximated k-nn classification. Pattern Recogn. Lett. 24(1-3), 47–53 (2003), http://dx.doi.org/10.1016/S0167-86550200187-3 , doi:10.1016/S0167-8655(02)00187-3
Article MATH Google Scholar
Radovanović, M., Nanopoulos, A., Ivanović, M.: Hubs in space: Popular nearest neighbors in high-dimensional data. J. Mach. Learn. Res. 11, 2487–2531 (2010), http://dl.acm.org/citation.cfm?id=1756006.1953015
MathSciNet MATH Google Scholar
Sharifzadeh, M., Shahabi, C.: Vor-tree: R-trees with voronoi diagrams for efficient processing of spatial nearest neighbor queries. Proc. VLDB Endow. 3(1-2), 1231–1242 (2010), http://dl.acm.org/citation.cfm?id=1920841.1920994
Google Scholar
Shen, H.T., Zhou, X., Zhou, A.: An adaptive and dynamic dimensionality reduction method for high-dimensional indexing. The VLDB Journal 16(2), 219–234 (2007), http://dx.doi.org/10.1007/s00778-005-0167-3 , doi:10.1007/s00778-005-0167-3
Article Google Scholar
Skopal, T.: Where are you heading, metric access methods?: a provocative survey. In: Proceedings of the Third International Conference on SImilarity Search and Applications, SISAP 2010, pp. 13–21. ACM, New York (2010), http://doi.acm.org/10.1145/1862344.1862347 , doi:10.1145/1862344.1862347
Chapter Google Scholar
Skopal, T., Pokorný, J., Krátký, M., Snášel, V.: Revisiting M-Tree Building Principles. In: Kalinichenko, L.A., Manthey, R., Thalheim, B., Wloka, U. (eds.) ADBIS 2003. LNCS, vol. 2798, pp. 148–162. Springer, Heidelberg (2003)
Chapter Google Scholar
Thomasian, A., Zhang, L.: Persistent clustered main memory index for accelerating k-nn queries on high dimensional datasets. Multimedia Tools Appl. 38(2), 253–270 (2008), http://dx.doi.org/10.1007/s11042-007-0179-7 , doi:10.1007/s11042-007-0179-7
Article Google Scholar
Traina Jr., C., Traina, A.J.M., Seeger, B., Faloutsos, C.: Slim-Trees: High Performance Metric Trees Minimizing Overlap between Nodes. In: Zaniolo, C., Grust, T., Scholl, M.H., Lockemann, P.C. (eds.) EDBT 2000. LNCS, vol. 1777, pp. 51–65. Springer, Heidelberg (2000), http://dl.acm.org/citation.cfm?id=645339.650146
Chapter Google Scholar
Vidal, E.: New formulation and improvements of the nearest-neighbour approximating and eliminating search algorithm (aesa). Pattern Recognition Letters 15(1), 1–7 (1994)
Article MathSciNet Google Scholar
Xu, J., Zheng, B., Lee, W.C., Lun Lee, D.: The d-tree: An index structure for planar point queries in location-based wireless services. IEEE Trans. on Knowl. and Data Eng. 16(12), 1526–1542 (2004), http://dx.doi.org/10.1109/TKDE.2004.97 , doi:10.1109/TKDE.2004.97
Article Google Scholar

Download references

Author information

Authors and Affiliations

Saint Petersburg University, Petersburg, Russia
Elena Mikhaylova, Boris Novikov & Anton Volokhov

Authors

Elena Mikhaylova
View author publications
You can also search for this author in PubMed Google Scholar
Boris Novikov
View author publications
You can also search for this author in PubMed Google Scholar
Anton Volokhov
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Elena Mikhaylova .

Editor information

Editors and Affiliations

Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 3a, Poznan, 60-965, Poland
Tadeusz Morzy
, Dept. of Computer Science, University of Kaiserslautern, Kaiserslautern, 67653, Germany
Theo Härder
Institute of Computing Science, Faculty of Computing, Poznań University of Technology, Piotrowo 2, Poznań, 60-965, Poland
Robert Wrembel

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Mikhaylova, E., Novikov, B., Volokhov, A. (2013). An Indexing Structure for Dynamic Multidimensional Data in Vector Space. In: Morzy, T., Härder, T., Wrembel, R. (eds) Advances in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 186. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32741-4_17

Download citation

DOI: https://doi.org/10.1007/978-3-642-32741-4_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32740-7
Online ISBN: 978-3-642-32741-4
eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics

An Indexing Structure for Dynamic Multidimensional Data in Vector Space

Abstract

Chapter PDF

Similar content being viewed by others

PL-Tree: An Efficient Indexing Method for High-Dimensional Data

The Design Perspective of the Structures Based on k-d Tree

High-dimensional similarity searches using query driven dynamic quantization and distributed indexing

Keywords

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

An Indexing Structure for Dynamic Multidimensional Data in Vector Space

Abstract

Chapter PDF

Similar content being viewed by others

PL-Tree: An Efficient Indexing Method for High-Dimensional Data

The Design Perspective of the Structures Based on k-d Tree

High-dimensional similarity searches using query driven dynamic quantization and distributed indexing

Keywords

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation