Abstract
Visual Question Answering (VQA) draws on computer vision and natural language processing (NLP). The goal of a VQA system is to predict a textual answer to a question about an image. The system takes an image and a question as input and combines information from both to generate a readable answer. A medical VQA system offers two advantages: radiologists can use it to support their interpretation of a medical image, and patients can use it to get basic information about a clinical image before consulting a doctor. In this paper, we discuss VQA on medical images using the ImageCLEF 2019 VQA dataset. We build a medical VQA system that applies transfer learning with MobileNet to radiology images and predicts answers to questions in the plane class. Evaluated on the test dataset, the proposed model achieves an accuracy of 80.8% on the plane class.
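The abstract does not spell out how the reported accuracy is computed. The following is a minimal sketch, assuming exact-match scoring of predicted plane labels against reference labels after lowercasing and whitespace trimming. The names `normalize` and `plane_accuracy` and the toy labels are hypothetical and are not taken from the paper or the dataset.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so 'Axial ' matches 'axial'."""
    return " ".join(answer.lower().split())


def plane_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of questions whose predicted plane matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot score an empty test set")
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy labels for illustration only (assumption, not real dataset content).
reference = ["axial", "sagittal", "coronal", "axial", "lateral"]
predicted = ["Axial", "sagittal", "axial", "axial ", "lateral"]

score = plane_accuracy(predicted, reference)
print(f"accuracy = {score:.1%}")
```

Under this assumption a prediction counts as correct only if the whole normalized label matches, so a near miss such as "axial" for "coronal" scores zero.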
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dhanush, C., Kumar, D.P., Kanavalli, A. (2021). A VQA System for Medical Image Classification Using Transfer Learning. In: Bhateja, V., Satapathy, S.C., Travieso-González, C.M., Aradhya, V.N.M. (eds) Data Engineering and Intelligent Computing. Advances in Intelligent Systems and Computing, vol 1407. Springer, Singapore. https://doi.org/10.1007/978-981-16-0171-2_24
DOI: https://doi.org/10.1007/978-981-16-0171-2_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0170-5
Online ISBN: 978-981-16-0171-2
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)