AutoCLIP: Auto-tuning zero-shot classifiers for vision-language models

JH Metzen, P Saranrittichai, CK Mummadi - arXiv preprint arXiv…, 2023 - arxiv.org
Classifiers built upon vision-language models such as CLIP have shown remarkable zero-
shot performance across a broad range of image classification tasks. Prior work has studied
different ways of automatically creating descriptor sets for every class based on prompt
templates, ranging from manually engineered templates over templates obtained from a
large language model to templates built from random words and characters. Up until now,
deriving zero-shot classifiers from the respective encoded class descriptors has remained …

AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models

J Hendrik Metzen, P Saranrittichai… - arXiv e…, 2023 - ui.adsabs.harvard.edu