Showing 1–1 of 1 results for author: Rintamaki, T

  1. arXiv:2409.11402  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MM

    NVLM: Open Frontier-Class Multimodal LLMs

    Authors: Wenliang Dai, Nayeon Lee, Boxin Wang, Zhuoling Yang, Zihan Liu, Jon Barker, Tuomas Rintamaki, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping

    Abstract: We introduce NVLM 1.0, a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2). Remarkably, NVLM 1.0 shows improved text-only performance over its LLM backbone after multimodal training. In terms of model desi…

    Submitted 17 September, 2024; originally announced September 2024.