Abstract
Research on cross-modal retrieval has broadened access to diverse data resources, and cross-modal hashing methods in particular have attracted widespread attention owing to their strong performance. However, existing hashing-based methods cannot establish deep inter-modal correlations while fully exploiting semantic information, and single-layer hashing may yield hash representations that are not robust enough. To address these issues, we propose a new method that combines multi-attention with multi-layer hashing to enable mutual retrieval between different forms of data. First, the modal-attention component of the multi-attention module captures bit-level dependencies between cross-modal features to build deeper inter-modal correlations. Meanwhile, the semantic-attention component retains common semantic information, which improves retrieval accuracy. Second, to learn more robust hash representations, multi-layer hashing jointly performs the hash-representation learning task and avoids the shortcomings of single-layer hashing. Experiments on two cross-modal datasets show that the proposed model outperforms comparison methods on most evaluation metrics.
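To make the two ideas in the abstract more concrete, the following minimal PyTorch sketch illustrates bit-level modal attention between paired features and the fusion of several hash layers into one code. It is an illustration only, not the authors' implementation: the module names (ModalAttention, MultiLayerHash), the feature dimensions, the sigmoid gating, and the averaging fusion are all assumptions made for this example.

```python
# Illustrative sketch only; not the paper's released code.
import torch
import torch.nn as nn


class ModalAttention(nn.Module):
    """Re-weights each feature dimension (bit level) conditioned on the paired modality."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, dim)

    def forward(self, own_feat, other_feat):
        # Attention weights derived from the other modality's features (assumed gating form).
        weights = torch.sigmoid(self.score(other_feat))
        return own_feat * weights


class MultiLayerHash(nn.Module):
    """Several hash layers whose outputs are fused into a single code (assumed mean fusion)."""
    def __init__(self, dim, code_len, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, code_len) for _ in range(n_layers)])

    def forward(self, feat):
        # Each layer produces a relaxed (tanh) code; fuse them, then binarize with sign.
        codes = torch.stack([torch.tanh(layer(feat)) for layer in self.layers], dim=0)
        fused = codes.mean(dim=0)
        return torch.sign(fused), fused  # binary code and its continuous relaxation for training


if __name__ == "__main__":
    img_feat = torch.randn(8, 512)  # e.g. image features from a CNN backbone
    txt_feat = torch.randn(8, 512)  # e.g. text features from a bag-of-words MLP

    attn = ModalAttention(512)
    hasher = MultiLayerHash(512, code_len=64)

    img_code, _ = hasher(attn(img_feat, txt_feat))
    txt_code, _ = hasher(attn(txt_feat, img_feat))
    print(img_code.shape, txt_code.shape)  # torch.Size([8, 64]) for each modality
```

In practice the continuous relaxation would be used in the training loss, since the sign operation is not differentiable; the exact loss terms and semantic-attention branch are described in the paper itself.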
Acknowledgement
This study was supported by the National Natural Science Foundation of China (61911540482 and 61702324).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, Z., Li, M., Chen, T. (2022). Multi-attention and Multi-layer Hashing for Cross-Modal Retrieval. In: Chu, SC., Lin, J.CW., Li, J., Pan, JS. (eds) Genetic and Evolutionary Computing. ICGEC 2021. Lecture Notes in Electrical Engineering, vol 833. Springer, Singapore. https://doi.org/10.1007/978-981-16-8430-2_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8429-6
Online ISBN: 978-981-16-8430-2
eBook Packages: Intelligent Technologies and Robotics (R0)