Abstract
Locality sensitive hashing (LSH) is the most popular algorithm for approximate nearest neighbor (ANN) search. As LSH partitions vector space uniformly and the distribution of vectors is usually non-uniform, it poorly fits real dataset and has limited performance. In this paper, we propose a new data-dependent LSH algorithm, which has two-level structures to perform ANN search in high dimensional spaces. In the first level, we first train a number of cluster centers, then use the cluster centers to divide the dataset into many clusters and the vectors in each cluster has near uniform distribution. In the second level, we construct LSH tables for each cluster. Given a query, we first determine a few clusters that it belongs to with high probability, and then perform ANN search in the corresponding LSH tables. Experimental results on the reference datasets show that the search speed can be increased by 48 times compared to E2LSH, while keeping high search precision.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Wan, J., Tang, S., Zhang, Y., Huang, L., Li, J.: Data Driven Multi-Index Hashing. In: IEEE International Conference on Image Processing (2013)
Zezula, P., Amato, G., Dohnal, V., et al.: Similarity Search: The metric space approach. Advances in Database Systems (2006)
Adonis, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: Symposium on Foundations of Computer Science (2006)
Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Symposium on Computational Geometry (2004)
Jegou, H., Amsaleg, L., Schmid, C., Gro, P.: Query adaptative locality sensitive hashing. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (2008)
Weiss, Y., Torralba, A., Fergus, R.: Spectral Hashing. In: Advances in Neural Information Processing Systems (2008)
Heo, J.-P., Lee, Y.: Spherical Hashing. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Pan, J., Manocha, D.: Bi-level Locality Sensitive Hashing for k-Nearest Neighbor Computation. In: Very Large Data Base (2010)
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer (2006)
Jegou, H., Douze, M., et al.: Product Quantization for Nearest Neighbor Search. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(1), 117–128 (2011)
Pauleve, L., Jegou, H., Amsaleg, L.: Locality sensitive hashing: A comparison of hash function types and querying mechanisms. Elsevier B.V. (2010)
Lv, Q., Josephson, W., Wang, Z., Charikar, M., Li, K.: Multi-probe LSH: efficient indexing for high-dimensional similarity search. In: Very Large Data Base (2007)
Bawa, M., Condie, T., Ganesan, P.: LSH forest: self-tuning indexes for similarity search. In: International Conference on World Wide Web (2005)
Dong, W., Wang, Z., Josephson, W., Charikar, M., Li, K.: Modeling lsh for performance tuning. In: Conference on Information and Knowledge Management (2008)
Babenko, A., Lempitsky, V.: The inverted multi-index. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Xie, H., Zhang, Y., Tan, J., Guo, L., Li, J.: Contextual Query Expansion for Image Retrieval. IEEE Trans. on Multimedia 16(4) (June 2014)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Xie, H., Chen, Z., Liu, Y., Tan, J., Guo, L. (2014). Data-Dependent Locality Sensitive Hashing. In: Ooi, W.T., Snoek, C.G.M., Tan, H.K., Ho, CK., Huet, B., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2014. PCM 2014. Lecture Notes in Computer Science, vol 8879. Springer, Cham. https://doi.org/10.1007/978-3-319-13168-9_32
Download citation
DOI: https://doi.org/10.1007/978-3-319-13168-9_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13167-2
Online ISBN: 978-3-319-13168-9
eBook Packages: Computer ScienceComputer Science (R0)