A VQA System for Medical Image Classification Using Transfer Learning

Dhanush, C.; Kumar, D. Pradeep; Kanavalli, Anita

doi:10.1007/978-981-16-0171-2_24

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1407)

Abstract

Visual Question Answering (VQA) combines Computer Vision and Natural Language Processing (NLP). The goal of a VQA system is to predict the textual answer to a question about an image. The system takes an image and a question as input and combines the information from both to generate a readable answer as output. A medical VQA system has two main advantages: radiologists can use it to support their inferences about a medical image, and patients can use it to get basic information about a clinical image before consulting a doctor. In this paper, we discuss VQA on medical images using the ImageCLEF 2019 VQA dataset. We build a medical VQA system that applies transfer learning with MobileNet to radiology images and predicts the answer for questions in the plane class. The proposed VQA model is evaluated on the test dataset and achieves an accuracy of 80.8% on the plane class.
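
As a rough illustration of how an accuracy figure such as the one reported for the plane class can be computed, the sketch below counts exact matches between predicted and reference answers. It is not the authors' evaluation code: the function name plane_accuracy, the case and whitespace normalisation, and the sample labels are all assumptions made for illustration, and the labels are not taken from the ImageCLEF 2019 dataset.

```python
def plane_accuracy(predicted, reference):
    """Return the fraction of questions whose predicted answer matches the reference.

    Matching ignores case and surrounding or repeated whitespace (an assumption;
    the chapter does not state its normalisation rules).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty test set")

    def normalise(answer):
        # Lowercase and collapse whitespace so "Axial " and "axial" compare equal.
        return " ".join(answer.lower().split())

    correct = sum(normalise(p) == normalise(r) for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical answers for five plane-class questions (illustrative only).
reference = ["axial", "sagittal", "coronal", "axial", "Axial"]
predicted = ["axial", "sagittal", "axial", "axial", "axial "]

print(f"{plane_accuracy(predicted, reference):.1%}")  # prints 80.0%
```

Here four of the five hypothetical predictions match, giving 80.0%. The same ratio computed over the full test set would yield a figure comparable to the reported one.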

Author information

Authors and Affiliations

Ramaiah Institute of Technology, Bangalore, Karnataka, India
C. Dhanush, D. Pradeep Kumar & Anita Kanavalli

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Vikrant Bhateja
School of Computer Engineering, Kalinga Institute of Industrial Technology, Bhubaneswar, India
Suresh Chandra Satapathy
Department of Signals and Communications, Institute for Technological Development, Las Palmas de Gran Canaria, Spain
Carlos M. Travieso-González
Department of Computer Applications, JSS Science and Technology University, Mysuru, India
V. N. Manjunath Aradhya

Cite this paper

Dhanush, C., Kumar, D.P., Kanavalli, A. (2021). A VQA System for Medical Image Classification Using Transfer Learning. In: Bhateja, V., Satapathy, S.C., Travieso-González, C.M., Aradhya, V.N.M. (eds) Data Engineering and Intelligent Computing. Advances in Intelligent Systems and Computing, vol 1407. Springer, Singapore. https://doi.org/10.1007/978-981-16-0171-2_24

DOI: https://doi.org/10.1007/978-981-16-0171-2_24
Published: 05 May 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0170-5
Online ISBN: 978-981-16-0171-2
eBook Packages: Intelligent Technologies and Robotics (R0)
