Overview of the ImageCLEF 2024: Multimedia Retrieval in Medical Applications

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2024)

Abstract

This paper presents an overview of the ImageCLEF 2024 lab, organized as part of the Conference and Labs of the Evaluation Forum (CLEF Labs 2024). ImageCLEF, an ongoing evaluation event since 2003, encourages the evaluation of technologies for annotation, indexing, and retrieval of multimodal data. The goal is to provide information access to large collections of data across various usage scenarios and domains. In 2024, the 22nd edition of ImageCLEF runs three main tasks: (i) a medical task, continuing the caption analysis, Visual Question Answering for colonoscopy images alongside GANs for medical images, and medical dialogue summarization; (ii) a novel task on image retrieval/generation for arguments in visual communication, aimed at making arguments more effective; and (iii) ToPicto, a new task focused on translating natural language, whether spoken or textual, into a sequence of pictograms. The benchmarking campaign was a success, with over 35 groups participating and submitting more than 220 runs.

Acknowledgements

The lab and the ImageCLEFmedical GANs tasks are supported under the H2020 AI4Media “A European Excellence Centre for Media, Society and Democracy” project, contract #951911. The work of Louise Bloch, Raphael Brüngel and Benjamin Bracke was partially funded by a PhD grant from the University of Applied Sciences and Arts Dortmund (FH Dortmund), Germany. The work of Ahmad Idrissi-Yaghir, Tabea M. G. Pakull, Hendrik Damm and Henning Schäfer was funded by a PhD grant from the DFG Research Training Group 2535 “Knowledge- and data-based personalisation of medicine at the point of care” (WisPerMed). The ToPicto task was funded by the Agence Nationale de la Recherche (ANR) through the project PROPICTO (ANR-20-CE93-0005).

Corresponding author

Correspondence to Bogdan Ionescu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ionescu, B. et al. (2024). Overview of the ImageCLEF 2024: Multimedia Retrieval in Medical Applications. In: Goeuriot, L., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2024. Lecture Notes in Computer Science, vol 14959. Springer, Cham. https://doi.org/10.1007/978-3-031-71908-0_7

  • DOI: https://doi.org/10.1007/978-3-031-71908-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-71907-3

  • Online ISBN: 978-3-031-71908-0

  • eBook Packages: Computer Science, Computer Science (R0)
