Showing 1–30 of 30 results for author: Kadambi, A

  1. arXiv:2407.16902  [pdf, other]

    cs.CY cs.AI

    The Potential and Perils of Generative Artificial Intelligence for Quality Improvement and Patient Safety

    Authors: Laleh Jalilian, Daniel McDuff, Achuta Kadambi

    Abstract: Generative artificial intelligence (GenAI) has the potential to improve healthcare through automation that enhances the quality and safety of patient care. Powered by foundation models that have been pretrained and can generate complex content, GenAI represents a paradigm shift away from the more traditional focus on task-specific classifiers that have dominated the AI landscape thus far. We posit…

    Submitted 23 June, 2024; originally announced July 2024.

  2. arXiv:2407.11936  [pdf, other]

    cs.CV

    Thermal Imaging and Radar for Remote Sleep Monitoring of Breathing and Apnea

    Authors: Kai Del Regno, Alexander Vilesov, Adnan Armouti, Anirudh Bindiganavale Harish, Selim Emir Can, Ashley Kita, Achuta Kadambi

    Abstract: Polysomnography (PSG), the current gold standard method for monitoring and detecting sleep disorders, is cumbersome and costly. At-home testing solutions, known as home sleep apnea testing (HSAT), exist. However, they are contact-based, a feature which limits the ability of some patient populations to tolerate testing and discourages widespread deployment. Previous work on non-contact sleep monito…

    Submitted 7 August, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

  3. arXiv:2407.04169  [pdf, other]

    cs.CV cs.CR

    Solutions to Deepfakes: Can Camera Hardware, Cryptography, and Deep Learning Verify Real Images?

    Authors: Alexander Vilesov, Yuan Tian, Nader Sehatbakhsh, Achuta Kadambi

    Abstract: The exponential progress in generative AI poses serious implications for the credibility of all real images and videos. There will exist a point in the future where 1) digital content produced by generative AI will be indistinguishable from that created by cameras, 2) high-quality generative algorithms will be accessible to anyone, and 3) the ratio of all synthetic to real images will be large. I…

    Submitted 4 July, 2024; originally announced July 2024.

  4. arXiv:2406.13527  [pdf, other]

    cs.CV

    4K4DGen: Panoramic 4D Generation at 4K Resolution

    Authors: Renjie Li, Panwang Pan, Bangbang Yang, Dejia Xu, Shijie Zhou, Xuanyang Zhang, Zeming Li, Achuta Kadambi, Zhangyang Wang, Zhengzhong Tu, Zhiwen Fan

    Abstract: The blooming of virtual reality and augmented reality (VR/AR) technologies has driven an increasing demand for the creation of high-quality, immersive, and dynamic environments. However, existing generative techniques either focus solely on dynamic objects or perform outpainting from a single perspective image, failing to meet the requirements of VR/AR applications that need free-viewpoint, 360…

    Submitted 3 October, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2405.17315  [pdf, other]

    cs.CV

    All-day Depth Completion

    Authors: Vadim Ezhov, Hyoungseob Park, Zhaoyang Zhang, Rishi Upadhyay, Howard Zhang, Chethan Chinder Chandrappa, Achuta Kadambi, Yunhao Ba, Julie Dorsey, Alex Wong

    Abstract: We propose a method for depth estimation under different illumination conditions, i.e., day and night time. As photometry is uninformative in regions under low-illumination, we tackle the problem through a multi-sensor fusion approach, where we take as input an additional synchronized sparse point cloud (i.e., from a LiDAR) projected onto the image plane as a sparse depth map, along with a camera…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 8 pages, 4 figures

  6. arXiv:2404.06903  [pdf, other]

    cs.CV cs.AI

    DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

    Authors: Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi

    Abstract: The increasing demand for virtual reality applications has highlighted the significance of crafting immersive 3D assets. We present a text-to-3D 360° scene generation pipeline that facilitates the creation of comprehensive 360° scenes for in-the-wild environments in a matter of minutes. Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement…

    Submitted 25 July, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  7. arXiv:2403.14874  [pdf, other]

    cs.CV cs.LG

    WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

    Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and find that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first…

    Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

  8. arXiv:2403.12327  [pdf, other]

    cs.CV cs.LG

    GT-Rain Single Image Deraining Challenge Report

    Authors: Howard Zhang, Yunhao Ba, Ethan Yang, Rishi Upadhyay, Alex Wong, Achuta Kadambi, Yun Guo, Xueyao Xiao, Xiaoxiong Wang, Yi Li, Yi Chang, Luxin Yan, Chaochao Zheng, Luping Wang, Bin Liu, Sunder Ali Khowaja, Jiseok Yoon, Ik-Hyun Lee, Zhao Zhang, Yanyan Wei, Jiahuan Ren, Suiyi Zhao, Huan Zheng

    Abstract: This report reviews the results of the GT-Rain challenge on single image deraining at the UG2+ workshop at CVPR 2023. The aim of this competition is to study the rainy weather phenomenon in real world scenarios, provide a novel real world rainy image dataset, and to spark innovative ideas that will further the development of single image deraining methods on real images. Submissions were trained o…

    Submitted 18 March, 2024; originally announced March 2024.

  9. arXiv:2312.17234  [pdf, other]

    cs.CV

    Personalized Restoration via Dual-Pivot Tuning

    Authors: Pradyumna Chari, Sizhuo Ma, Daniil Ostashev, Achuta Kadambi, Gurunandan Krishnan, Jian Wang, Kfir Aberman

    Abstract: Generative diffusion models can serve as a prior which ensures that solutions of image restoration systems adhere to the manifold of natural images. However, for restoring facial images, a personalized prior is necessary to accurately represent and reconstruct unique facial features of a given individual. In this paper, we propose a simple, yet effective, method for personalized restoration, calle…

    Submitted 28 December, 2023; originally announced December 2023.

  10. arXiv:2312.09534  [pdf, other]

    cs.CV

    WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Matthew Waliman, Yunhao Ba, Alex Wong, Achuta Kadambi

    Abstract: The introduction of large, foundational models to computer vision has led to drastically improved performance on the task of semantic segmentation. However, these existing methods exhibit a large performance drop when testing on images degraded by weather conditions such as rain, fog, or snow. We introduce a general paired-training method that can be applied to all current foundational model archi…

    Submitted 14 December, 2023; originally announced December 2023.

  11. arXiv:2312.04875  [pdf, other]

    cs.CV

    MVDD: Multi-View Depth Diffusion Models

    Authors: Zhen Wang, Qiangeng Xu, Feitong Tan, Menglei Chai, Shichen Liu, Rohit Pandey, Sean Fanello, Achuta Kadambi, Yinda Zhang

    Abstract: Denoising diffusion models have demonstrated outstanding results in 2D image generation, yet it remains a challenge to replicate its success in 3D shape generation. In this paper, we propose leveraging multi-view depth, which represents complex 3D shapes in a 2D data format that is easy to denoise. We pair this representation with a diffusion model, MVDD, that is capable of generating high-quality…

    Submitted 19 December, 2023; v1 submitted 8 December, 2023; originally announced December 2023.

  12. arXiv:2312.03203  [pdf, other]

    cs.CV

    Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields

    Authors: Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, Achuta Kadambi

    Abstract: 3D scene representations have gained immense popularity in recent years. Methods that use Neural Radiance Fields are versatile for traditional tasks such as novel view synthesis. In recent times, some work has emerged that aims to extend the functionality of NeRF beyond view synthesis, for semantically aware tasks such as editing and segmentation using 3D feature field distillation from 2D foundat…

    Submitted 8 April, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  13. arXiv:2312.00944  [pdf, other]

    cs.CV cs.GR

    Enhancing Diffusion Models with 3D Perspective Geometry Constraints

    Authors: Rishi Upadhyay, Howard Zhang, Yunhao Ba, Ethan Yang, Blake Gella, Sicheng Jiang, Alex Wong, Achuta Kadambi

    Abstract: While perspective is a well-studied topic in art, it is generally taken for granted in images. However, for the recent wave of high-quality image synthesis methods such as latent diffusion models, perspective accuracy is not an explicit requirement. Since these methods are capable of outputting a wide gamut of possible images, it is difficult for these synthesized images to adhere to the principle…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Project Webpage: http://visual.ee.ucla.edu/diffusionperspective.htm/

  14. arXiv:2312.00206  [pdf, other]

    cs.CV cs.LG eess.IV

    SparseGS: Real-Time 360° Sparse View Synthesis using Gaussian Splatting

    Authors: Haolin Xiong, Sairisheek Muttukuru, Rishi Upadhyay, Pradyumna Chari, Achuta Kadambi

    Abstract: The problem of novel view synthesis has grown significantly in popularity recently with the introduction of Neural Radiance Fields (NeRFs) and other implicit scene representation methods. A recent advance, 3D Gaussian Splatting (3DGS), leverages an explicit representation to achieve real-time rendering with high-quality results. However, 3DGS still requires an abundance of training views to genera…

    Submitted 13 May, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

    Comments: This is a revised version which includes multiple new components. Project page: https://github.com/ForMyCat/SparseGS

  15. arXiv:2311.17907  [pdf, other]

    cs.CV cs.AI

    CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting

    Authors: Alexander Vilesov, Pradyumna Chari, Achuta Kadambi

    Abstract: With the onset of diffusion-based generative models and their ability to generate text-conditioned images, content generation has received a massive invigoration. Recently, these models have been shown to provide useful guidance for the generation of 3D graphics assets. However, existing work in text-conditioned 3D generation faces fundamental constraints: (i) inability to generate detailed, multi…

    Submitted 29 November, 2023; originally announced November 2023.

  16. arXiv:2304.03243  [pdf, other]

    cs.AI cs.LG stat.AP

    Synthetic Data in Healthcare

    Authors: Daniel McDuff, Theodore Curran, Achuta Kadambi

    Abstract: Synthetic data are becoming a critical tool for building artificially intelligent systems. Simulators provide a way of generating data systematically and at scale. These data can then be used either exclusively, or in conjunction with real data, for training and testing systems. Synthetic data are particularly attractive in cases where the availability of "real" training examples might be a bott…

    Submitted 6 April, 2023; originally announced April 2023.

  17. arXiv:2212.04096  [pdf, other]

    cs.CV

    ALTO: Alternating Latent Topologies for Implicit 3D Reconstruction

    Authors: Zhen Wang, Shijie Zhou, Jeong Joon Park, Despoina Paschalidou, Suya You, Gordon Wetzstein, Leonidas Guibas, Achuta Kadambi

    Abstract: This work introduces alternating latent topologies (ALTO) for high-fidelity reconstruction of implicit 3D surfaces from noisy point clouds. Previous work identifies that the spatial arrangement of latent encodings is important to recover detail. One school of thought is to encode a latent vector for each point (point latents). Another school of thought is to project point latents into a grid (grid…

    Submitted 8 December, 2022; originally announced December 2022.

  18. arXiv:2209.00746  [pdf, other]

    cs.LG cs.CV

    MIME: Minority Inclusion for Majority Group Enhancement of AI Performance

    Authors: Pradyumna Chari, Yunhao Ba, Shreeram Athreya, Achuta Kadambi

    Abstract: Several papers have rightly included minority groups in artificial intelligence (AI) training data to improve test inference for minority groups and/or society-at-large. A society-at-large consists of both minority and majority stakeholders. A common misconception is that minority inclusion does not increase performance for majority groups alone. In this paper, we make the surprising finding that…

    Submitted 1 September, 2022; originally announced September 2022.

  19. arXiv:2206.10779  [pdf, other]

    cs.CV

    Not Just Streaks: Towards Ground Truth for Single Image Deraining

    Authors: Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso de Melo, Suya You, Stefano Soatto, Alex Wong, Achuta Kadambi

    Abstract: We propose a large-scale dataset of real-world rainy and clean image pairs and a method to remove degradations, induced by rain streaks and rain accumulation, from the image. As there exists no real-world dataset for deraining, current state-of-the-art methods rely on synthetic data and thus are limited by the sim2real domain gap; moreover, rigorous evaluation remains a challenge due to the absenc…

    Submitted 29 July, 2024; v1 submitted 21 June, 2022; originally announced June 2022.

  20. arXiv:2109.13488  [pdf, other]

    cs.CV

    Towards Rotation Invariance in Object Detection

    Authors: Agastya Kalra, Guy Stoppi, Bradley Brown, Rishav Agarwal, Achuta Kadambi

    Abstract: Rotation augmentations generally improve a model's invariance/equivariance to rotation - except in object detection. In object detection the shape is not known, therefore rotation creates a label ambiguity. We show that the de-facto method for bounding box label rotation, the Largest Box Method, creates very large labels, leading to poor performance and in many cases worse performance than using n…

    Submitted 30 September, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: Accepted ICCV 2021

  21. arXiv:2109.05959  [pdf]

    cs.ET physics.optics

    Physics-AI Symbiosis

    Authors: Bahram Jalali, Achuta Kadambi, Vwani Roychowdhury

    Abstract: The phenomenal success of physics in explaining nature and designing hardware is predicated on efficient computational models. A universal codebook of physical laws defines the computational rules and a physical system is an interacting ensemble governed by these rules. Led by deep neural networks, artificial intelligence (AI) has introduced an alternate end-to-end data-driven computational framew…

    Submitted 10 September, 2021; originally announced September 2021.

  22. arXiv:2106.06007  [pdf, other]

    cs.CV

    Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG by Synthetic Augmentation

    Authors: Yunhao Ba, Zhen Wang, Kerim Doruk Karinca, Oyku Deniz Bozkurt, Achuta Kadambi

    Abstract: Camera-based remote photoplethysmography (rPPG) provides a non-contact way to measure physiological signals (e.g., heart rate) using facial videos. Recent deep learning architectures have improved the accuracy of such physiological measurement significantly, yet they are restricted by the diversity of the annotated videos. The existing datasets MMSE-HR, AFRL, and UBFC-RPPG contain roughly 10%, 0%,…

    Submitted 10 June, 2021; originally announced June 2021.

  23. arXiv:1911.12906  [pdf]

    eess.IV cs.CV

    Enhancing Passive Non-Line-of-Sight Imaging Using Polarization Cues

    Authors: Kenichiro Tanaka, Yasuhiro Mukaigawa, Achuta Kadambi

    Abstract: This paper presents a method of passive non-line-of-sight (NLOS) imaging using polarization cues. A key observation is that the oblique light has a different polarimetric signal. It turns out this effect is due to the polarization axis rotation, a phenomenon which can be used to better condition the light transport matrix for non-line-of-sight imaging. Our analysis and results show that the use of…

    Submitted 28 November, 2019; originally announced November 2019.

  24. arXiv:1911.11893  [pdf, other]

    cs.CV

    Visual Physics: Discovering Physical Laws from Videos

    Authors: Pradyumna Chari, Chinmay Talegaonkar, Yunhao Ba, Achuta Kadambi

    Abstract: In this paper, we teach a machine to discover the laws of physics from video streams. We assume no prior knowledge of physics, beyond a temporal stream of bounding boxes. The problem is very difficult because a machine must learn not only a governing equation (e.g. projectile motion) but also the existence of governing parameters (e.g. velocities). We evaluate our ability to discover physical laws…

    Submitted 26 November, 2019; originally announced November 2019.

  25. arXiv:1910.00201  [pdf, other]

    cs.LG stat.ML

    Blending Diverse Physical Priors with Neural Networks

    Authors: Yunhao Ba, Guangyuan Zhao, Achuta Kadambi

    Abstract: Machine learning in the context of physical systems merits a re-examination of the learning strategy. In addition to data, one can leverage a vast library of physical prior models (e.g. kinematics, fluid flow, etc.) to perform more robust inference. The nascent sub-field of physics-based learning (PBL) studies the blending of neural networks with physical priors. While previous PBL algorithms ha…

    Submitted 1 October, 2019; originally announced October 2019.

  26. arXiv:1903.10210  [pdf, other]

    cs.CV cs.LG

    Deep Shape from Polarization

    Authors: Yunhao Ba, Alex Ross Gilbert, Franklin Wang, Jinfa Yang, Rui Chen, Yiqin Wang, Lei Yan, Boxin Shi, Achuta Kadambi

    Abstract: This paper makes a first attempt to bring the Shape from Polarization (SfP) problem to the realm of deep learning. The previous state-of-the-art methods for SfP have been purely physics-based. We see value in these principled models, and blend these physical models as priors into a neural network architecture. This proposed approach achieves results that exceed the previous state-of-the-art on a c…

    Submitted 25 May, 2020; v1 submitted 25 March, 2019; originally announced March 2019.

  27. arXiv:1605.02066  [pdf, other]

    cs.CV

    Shape from Mixed Polarization

    Authors: Vage Taamazyan, Achuta Kadambi, Ramesh Raskar

    Abstract: Shape from Polarization (SfP) estimates surface normals using photos captured at different polarizer rotations. Fundamentally, the SfP model assumes that light is reflected either diffusely or specularly. However, this model is not valid for many real-world surfaces exhibiting a mixture of diffuse and specular properties. To address this challenge, previous methods have used a sequential solution:…

    Submitted 11 June, 2016; v1 submitted 5 May, 2016; originally announced May 2016.

    Comments: 13 pages, 5 figures

  28. arXiv:1503.01804  [pdf, other]

    cs.CV cs.GR

    Frequency Domain TOF: Encoding Object Depth in Modulation Frequency

    Authors: Achuta Kadambi, Vage Taamazyan, Suren Jayasuriya, Ramesh Raskar

    Abstract: Time of flight cameras may emerge as the 3-D sensor of choice. Today, time of flight sensors use phase-based sampling, where the phase delay between emitted and received high-frequency signals encodes distance. In this paper, we present a new time of flight architecture that relies only on frequency; we refer to this technique as frequency-domain time of flight (FD-TOF). Inspired by optical cohe…

    Submitted 5 March, 2015; originally announced March 2015.

    Comments: 10 pages

  29. arXiv:1501.04878   

    cs.CV

    A Light Transport Model for Mitigating Multipath Interference in TOF Sensors

    Authors: Nikhil Naik, Achuta Kadambi, Christoph Rhemann, Shahram Izadi, Ramesh Raskar, Sing Bing Kang

    Abstract: Continuous-wave Time-of-flight (TOF) range imaging has become a commercially viable technology with many applications in computer vision and graphics. However, the depth images obtained from TOF cameras contain scene dependent errors due to multipath interference (MPI). Specifically, MPI occurs when multiple optical reflections return to a single spatial location on the imaging sensor. Many prior…

    Submitted 30 January, 2015; v1 submitted 20 January, 2015; originally announced January 2015.

    Comments: This paper has been withdrawn by the submitter as the submission was made due to a miscommunication

  30. arXiv:1404.1116  [pdf, other]

    cs.CV cs.IT physics.optics

    Resolving Multi-path Interference in Time-of-Flight Imaging via Modulation Frequency Diversity and Sparse Regularization

    Authors: Ayush Bhandari, Achuta Kadambi, Refael Whyte, Christopher Barsi, Micha Feigin, Adrian Dorrington, Ramesh Raskar

    Abstract: Time-of-flight (ToF) cameras calculate depth maps by reconstructing phase shifts of amplitude-modulated signals. For broad illumination or transparent objects, reflections from multiple scene points can illuminate a given pixel, giving rise to an erroneous depth map. We report here a sparsity regularized solution that separates K-interfering components using multiple modulation frequency measureme…

    Submitted 3 April, 2014; originally announced April 2014.

    Comments: 11 Pages, 4 figures, appeared with minor changes in Optics Letters

    Journal ref: Optics Letters, Vol. 39, Issue 6, pp. 1705-1708 (2014)