Nov 26, 2023: We propose a novel visual programming approach for zero-shot open-vocabulary 3DVG, leveraging the capabilities of large language models (LLMs).
Zero-shot 3DVG identifies the location of target objects using a programmatic representation generated by LLMs, i.e., target category, anchor category, and ...
3D Visual Grounding (3DVG) aims to localize specific objects within 3D scenes by using a series of textual descriptions. This has become a crucial component ...
3D Visual Grounding (3DVG) aims to localize 3D objects based on textual descriptions. Conventional supervised methods for 3DVG often necessitate extensive ...
Answer: Based on the description, we are looking for a storage shelf that is white in color and is above a desk with a chair in front.
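The query in the answer above (a white storage shelf above a desk) can be read as a small program over detected objects: filter by target category, filter by anchor category, then test a spatial relation. The following is a minimal sketch under stated assumptions, not the paper's implementation: the `Box` type, the `is_above` test with its `xy_tol` tolerance, and the `ground` function are all hypothetical names introduced here for illustration, and the scene data is made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Box:
    """Hypothetical detected object: axis-aligned box with a label and color."""
    category: str
    color: str
    center: Tuple[float, float, float]  # (x, y, z) in meters, z points up
    size: Tuple[float, float, float]    # extent along x, y, z


def is_above(target: Box, anchor: Box, xy_tol: float = 0.5) -> bool:
    """Assumed 'above' test: target's bottom is at or over the anchor's top,
    and the two centers are within xy_tol of each other horizontally."""
    target_bottom = target.center[2] - target.size[2] / 2
    anchor_top = anchor.center[2] + anchor.size[2] / 2
    dx = abs(target.center[0] - anchor.center[0])
    dy = abs(target.center[1] - anchor.center[1])
    return target_bottom >= anchor_top and dx <= xy_tol and dy <= xy_tol


def ground(scene: List[Box], target_category: str, anchor_category: str,
           color: Optional[str] = None) -> List[Box]:
    """Return targets of the given category (and color) above any anchor."""
    targets = [b for b in scene
               if b.category == target_category
               and (color is None or b.color == color)]
    anchors = [b for b in scene if b.category == anchor_category]
    return [t for t in targets if any(is_above(t, a) for a in anchors)]


# Made-up scene for illustration only.
scene = [
    Box("desk", "brown", (0.0, 0.0, 0.4), (1.2, 0.6, 0.8)),
    Box("shelf", "white", (0.0, 0.1, 1.5), (1.0, 0.3, 0.4)),  # over the desk
    Box("shelf", "white", (3.0, 3.0, 1.5), (1.0, 0.3, 0.4)),  # across the room
    Box("shelf", "black", (0.0, 0.1, 2.0), (1.0, 0.3, 0.4)),  # wrong color
]
matches = ground(scene, target_category="shelf", anchor_category="desk",
                 color="white")
```

In this sketch an LLM would only need to emit the call to `ground` with its arguments; the geometric check itself runs as ordinary code over the detected boxes.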
Sep 24, 2024: An LLM is then used to reason which object satisfies the grounding relationship. ZS3DVG [3] follows a similar pipeline but requires the LLM to ...
Z. Yuan, J. Ren, C. Feng, H. Zhao, S. Cui, and Z. Li. Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding. CoRR, 2023.