Showing 1–11 of 11 results for author: Krantz, J

  1. arXiv:2304.01192  [pdf, other]

    cs.CV cs.RO

    Navigating to Objects Specified by Images

    Authors: Jacob Krantz, Theophile Gervet, Karmesh Yadav, Austin Wang, Chris Paxton, Roozbeh Mottaghi, Dhruv Batra, Jitendra Malik, Stefan Lee, Devendra Singh Chaplot

    Abstract: Images are a convenient way to specify which particular object instance an embodied agent should navigate to. Solving this task requires semantic visual reasoning and exploration of unknown environments. We present a system that can perform this task in both simulation and the real world. Our modular method solves sub-tasks of exploration, goal instance re-identification, goal localization, and lo…

    Submitted 3 April, 2023; originally announced April 2023.

  2. arXiv:2211.15876  [pdf, other]

    cs.CV

    Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances

    Authors: Jacob Krantz, Stefan Lee, Jitendra Malik, Dhruv Batra, Devendra Singh Chaplot

    Abstract: We consider the problem of embodied visual navigation given an image-goal (ImageNav) where an agent is initialized in an unfamiliar environment and tasked with navigating to a location 'described' by an image. Unlike related navigation tasks, ImageNav does not have a standardized task definition which makes comparison across methods difficult. Further, existing formulations have two problematic pr…

    Submitted 28 November, 2022; originally announced November 2022.

  3. arXiv:2210.06849  [pdf, other]

    cs.CV

    Retrospectives on the Embodied AI Workshop

    Authors: Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, et al. (14 additional authors not shown)

    Abstract: We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) embodied vision-and-language. We discuss the dominant datasets within each theme, evaluation metrics for the challenges, and the performance of state-of…

    Submitted 4 December, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

  4. arXiv:2210.03087  [pdf, other]

    cs.CV cs.CL cs.RO

    Iterative Vision-and-Language Navigation

    Authors: Jacob Krantz, Shurjo Banerjee, Wang Zhu, Jason Corso, Peter Anderson, Stefan Lee, Jesse Thomason

    Abstract: We present Iterative Vision-and-Language Navigation (IVLN), a paradigm for evaluating language-guided agents navigating in a persistent environment over time. Existing Vision-and-Language Navigation (VLN) benchmarks erase the agent's memory at the beginning of every episode, testing the ability to perform cold-start navigation with no prior information. However, deployed robots occupy the same env…

    Submitted 24 December, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: Accepted by CVPR 2023

  5. arXiv:2204.09667  [pdf, ps, other]

    cs.CV cs.CL cs.RO

    Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments

    Authors: Jacob Krantz, Stefan Lee

    Abstract: Recent work in Vision-and-Language Navigation (VLN) has presented two environmental paradigms with differing realism -- the standard VLN setting built on topological environments where navigation is abstracted away, and the VLN-CE setting where agents must navigate continuous 3D environments using low-level actions. Despite sharing the high-level task and even the underlying instruction-path data,…

    Submitted 24 April, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: Changes: figure compression for accessibility

  6. arXiv:2110.02207  [pdf, other]

    cs.CV cs.CL cs.RO

    Waypoint Models for Instruction-guided Navigation in Continuous Environments

    Authors: Jacob Krantz, Aaron Gokaslan, Dhruv Batra, Stefan Lee, Oleksandr Maksymets

    Abstract: Little inquiry has explicitly addressed the role of action spaces in language-guided visual navigation -- either in terms of its effect on navigation success or the efficiency with which a robotic agent could execute the resulting trajectory. Building on the recently released VLN-CE setting for instruction following in continuous environments, we develop a class of language-conditioned waypoint pr…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: ICCV 2021

  7. arXiv:2011.08277  [pdf, other]

    cs.CV cs.CL

    Where Are You? Localization from Embodied Dialog

    Authors: Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson

    Abstract: We present Where Are You? (WAY), a dataset of ~6k dialogs in which two humans -- an Observer and a Locator -- complete a cooperative localization task. The Observer is spawned at random in a 3D environment and can navigate from first-person views while answering questions from the Locator. The Locator must localize the Observer in a detailed top-down map by asking questions and giving instructions…

    Submitted 3 September, 2021; v1 submitted 16 November, 2020; originally announced November 2020.

    Journal ref: EMNLP 2020

  8. arXiv:2004.02857  [pdf, other]

    cs.CV cs.CL cs.RO

    Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments

    Authors: Jacob Krantz, Erik Wijmans, Arjun Majumdar, Dhruv Batra, Stefan Lee

    Abstract: We develop a language-guided navigation task set in a continuous 3D environment where agents must execute low-level actions to follow natural language navigation directions. By being situated in continuous environments, this setting lifts a number of assumptions implicit in prior work that represents environments as a sparse graph of panoramas with edges corresponding to navigability. Specifically…

    Submitted 1 May, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  9. arXiv:1909.13362  [pdf, other]

    cs.CL

    Language-Agnostic Syllabification with Neural Sequence Labeling

    Authors: Jacob Krantz, Maxwell Dulin, Paul De Palma

    Abstract: The identification of syllables within phonetic sequences is known as syllabification. This task is thought to play an important role in natural language understanding, speech production, and the development of speech recognition systems. The concept of the syllable is cross-linguistic, though formal definitions are rarely agreed upon, even within a language. In response, data-driven syllabificati…

    Submitted 29 September, 2019; originally announced September 2019.

    Comments: Accepted as full paper for presentation at the 18th IEEE International Conference on Machine Learning and Applications (ICMLA 2019). 7 pages

  10. arXiv:1810.08838  [pdf, other]

    cs.CL

    Abstractive Summarization Using Attentive Neural Techniques

    Authors: Jacob Krantz, Jugal Kalita

    Abstract: In a world of proliferating data, the ability to rapidly summarize text is growing in importance. Automatic summarization of text can be thought of as a sequence to sequence problem. Another area of natural language processing that solves a sequence to sequence problem is machine translation, which is rapidly evolving due to the development of attention-based encoder-decoder networks. This work ap…

    Submitted 20 October, 2018; originally announced October 2018.

    Comments: Accepted for oral presentation at the 15th International Conference on Natural Language Processing (ICON 2018)

  11. Syllabification by Phone Categorization

    Authors: Jacob Krantz, Maxwell Dulin, Paul De Palma, Mark VanDam

    Abstract: Syllables play an important role in speech synthesis, speech recognition, and spoken document retrieval. A novel, low cost, and language agnostic approach to dividing words into their corresponding syllables is presented. A hybrid genetic algorithm constructs a categorization of phones optimized for syllabification. This categorization is used on top of a hidden Markov model sequence classifier to…

    Submitted 15 July, 2018; originally announced July 2018.

    Journal ref: Jacob Krantz, Maxwell Dulin, Paul De Palma, and Mark VanDam. 2018. Syllabification by Phone Categorization. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '18) 47-48