Object-centric unsupervised image captioning

Z Meng, D Yang, X Cao, A Shah, SN Lim - European Conference on …, 2022 - Springer
Image captioning is a longstanding problem in the field of computer vision and natural
language processing. To date, researchers have produced impressive state-of-the-art
performance in the age of deep learning. Most of these state-of-the-art methods, however, require a
large volume of annotated image-caption pairs in order to train their models. When given an
image dataset of interest, a practitioner needs to annotate the caption for each image in the
training set and this process needs to happen for each newly collected image dataset. In this …

[PDF] Object-Centric Unsupervised Image Captioning: Supplements

Z Meng, D Yang, X Cao, A Shah, SN Lim - ecva.net
Table 1 (borrowed from [4]) shows the statistics of COCO captions [1] and Localized
Narratives [4]. We can see that Localized Narratives contain more comprehensive and
semantically meaningful sentences. This is why we choose LN-COCO as the ground truth for
evaluation. We refer readers to the website of [4] for more qualitative examples of the
captions.