VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019
This paper presents an overview of the Medical Visual Question Answering task (VQA-Med) at ImageCLEF 2019. Participating systems were tasked with answering medical questions based on the visual content of radiology images. In this second edition of VQA-Med, we focused on four categories of clinical questions: Modality, Plane, Organ System, and Abnormality. These categories were designed with different degrees of difficulty, leveraging both classification and text generation approaches. We also ensured that all questions can be answered from the image content without requiring additional medical knowledge or domain-specific inference. Following these guidelines, we created a new dataset of 4,200 radiology images and 15,292 question-answer pairs. The challenge was well received, with 17 participating teams applying a wide range of approaches such as transfer learning, multi-task learning, and ensemble methods. The best team achieved a BLEU score of 64.4% and an accuracy of 62.4%. In future editions, we will consider designing more goal-oriented datasets and tackling new aspects such as contextual information and domain-specific inference.