Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Authors: Ankit Dhiman, Manan Shah, Rishubh Parihar, Yash Bhalgat, Lokesh R Boregowda, R Venkatesh Babu

Abstract: We tackle the problem of generating highly realistic and plausible mirror reflections using diffusion-based generative models. We formulate this problem as an image inpainting task, allowing for more user control over the placement of mirrors during the generation process. To enable this, we create SynMirror, a large-scale dataset of diverse synthetic scenes with objects placed in front of mirrors. SynMirror contains around 198K samples rendered from 66K unique 3D objects, along with their associated depth maps, normal maps and instance-wise segmentation masks, to capture relevant geometric properties of the scene. Using this dataset, we propose a novel depth-conditioned inpainting method called MirrorFusion, which generates high-quality geometrically consistent and photo-realistic mirror reflections given an input image and a mask depicting the mirror region. MirrorFusion outperforms state-of-the-art methods on SynMirror, as demonstrated by extensive quantitative and qualitative analysis. To the best of our knowledge, we are the first to successfully tackle the challenging problem of generating controlled and faithful mirror reflections of an object in a scene using diffusion based models. SynMirror and MirrorFusion open up new avenues for image editing and augmented reality applications for practitioners and researchers alike.

Submitted 22 September, 2024; originally announced September 2024.

Comments: Project Page: https://val.cds.iisc.ac.in/reflecting-reality.github.io/

arXiv:2406.07676 [pdf, other]

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation

Authors: Swarup Ranjan Behera, Abhishek Dhiman, Karthik Gowda, Aalekhya Satya Narayani

Abstract: Audio classification models, particularly the Audio Spectrogram Transformer (AST), play a crucial role in efficient audio analysis. However, optimizing their efficiency without compromising accuracy remains a challenge. In this paper, we introduce FastAST, a framework that integrates Token Merging (ToMe) into the AST framework. FastAST enhances inference speed without requiring extensive retraining by merging similar tokens in audio spectrograms. Furthermore, during training, FastAST brings about significant speed improvements. The experiments indicate that FastAST can increase audio classification throughput with minimal impact on accuracy. To mitigate the accuracy impact, we integrate Cross-Model Knowledge Distillation (CMKD) into the FastAST framework. Integrating ToMe and CMKD into AST results in improved accuracy compared to AST while maintaining faster inference speeds. FastAST represents a step towards real-time, resource-efficient audio analysis.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Accepted to Interspeech 2024

MSC Class: 68T10

arXiv:2312.14870 [pdf, other]

Numerical Reasoning for Financial Reports

Authors: Abhinav Arun, Ashish Dhiman, Mehul Soni, Yibei Hu

Abstract: Financial reports offer critical insights into a company's operations, yet their extensive length, typically spanning 30-40 pages, poses challenges for swift decision making in dynamic markets. To address this, we leveraged finetuned Large Language Models (LLMs) to distill key indicators and operational metrics from these reports based on questions from the user. We devised a method to locate critical data, and leveraged the FinQA dataset to fine-tune both Llama-2 7B and T5 models for customized question answering. We achieved results comparable to baseline on the final numerical answer, a competitive accuracy in numerical reasoning and calculation.

Submitted 22 December, 2023; originally announced December 2023.

Comments: 10 pages, 11 figures, 6 tables

arXiv:2312.06711 [pdf, other]

Physics Informed Neural Network for Option Pricing

Authors: Ashish Dhiman, Yibei Hu

Abstract: We apply a physics-informed deep-learning approach, the PINN approach, to the Black-Scholes equation for pricing American and European options. We test our approach on both simulated and real market data and compare it to analytical/numerical benchmarks. Our model is able to accurately capture the price behaviour on simulation data, while also exhibiting reasonable performance for market data. We also experiment with the architecture and learning process of our PINN model to provide more understanding of convergence and stability issues that impact performance.

Submitted 10 December, 2023; originally announced December 2023.

Comments: 7 pages + references
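
Note: the abstract above mentions comparison against analytical benchmarks. As a minimal sketch of what such a benchmark could look like for the European case only, the standard closed-form Black-Scholes price is shown below. This is not the authors' code; the function and parameter names are illustrative assumptions, and American options have no comparable closed form.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_european(spot: float, strike: float, rate: float,
                           vol: float, maturity: float,
                           kind: str = "call") -> float:
    """Closed-form Black-Scholes price of a European option.

    spot, strike: underlying price and strike (same currency units)
    rate: continuously compounded risk-free rate (per year)
    vol: volatility (per sqrt-year); maturity: time to expiry in years
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * maturity)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    if kind == "put":
        return strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    raise ValueError("kind must be 'call' or 'put'")


if __name__ == "__main__":
    # Hypothetical at-the-money example values, chosen only for illustration.
    print(black_scholes_european(100.0, 100.0, 0.05, 0.2, 1.0, "call"))
    print(black_scholes_european(100.0, 100.0, 0.05, 0.2, 1.0, "put"))
```

A learned price surface could be compared pointwise against this function on a grid of spot prices and maturities; the call and put values should also satisfy put-call parity, which gives a quick sanity check.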

arXiv:2309.07668 [pdf, other]

CoRF : Colorizing Radiance Fields using Knowledge Distillation

Authors: Ankit Dhiman, R Srinath, Srinjay Sarkar, Lokesh R Boregowda, R Venkatesh Babu

Abstract: Neural radiance field (NeRF) based methods enable high-quality novel-view synthesis for multi-view images. This work presents a method for synthesizing colorized novel views from input grey-scale multi-view images. When we apply image or video-based colorization methods on the generated grey-scale novel views, we observe artifacts due to inconsistency across views. Training a radiance field network on the colorized grey-scale image sequence also does not solve the 3D consistency issue. We propose a distillation based method to transfer color knowledge from the colorization networks trained on natural images to the radiance field network. Specifically, our method uses the radiance field network as a 3D representation and transfers knowledge from existing 2D colorization methods. The experimental results demonstrate that the proposed method produces superior colorized novel views for indoor and outdoor scenes while maintaining cross-view consistency than baselines. Further, we show the efficacy of our method on applications like colorization of radiance field network trained from 1.) Infra-Red (IR) multi-view images and 2.) Old grey-scale multi-view image sequences.

Submitted 14 September, 2023; originally announced September 2023.

Comments: AI3DCC @ ICCV 2023

arXiv:2308.10337 [pdf, other]

Strata-NeRF : Neural Radiance Fields for Stratified Scenes

Authors: Ankit Dhiman, Srinath R, Harsh Rangwani, Rishubh Parihar, Lokesh R Boregowda, Srinath Sridhar, R Venkatesh Babu

Abstract: Neural Radiance Field (NeRF) approaches learn the underlying 3D representation of a scene and generate photo-realistic novel views with high fidelity. However, most proposed settings concentrate on modelling a single object or a single level of a scene. However, in the real world, we may capture a scene at multiple levels, resulting in a layered capture. For example, tourists usually capture a monument's exterior structure before capturing the inner structure. Modelling such scenes in 3D with seamless switching between levels can drastically improve immersive experiences. However, most existing techniques struggle in modelling such scenes. We propose Strata-NeRF, a single neural radiance field that implicitly captures a scene with multiple levels. Strata-NeRF achieves this by conditioning the NeRFs on Vector Quantized (VQ) latent representations which allow sudden changes in scene structure. We evaluate the effectiveness of our approach in multi-layered synthetic dataset comprising diverse scenes and then further validate its generalization on the real-world RealEstate10K dataset. We find that Strata-NeRF effectively captures stratified scenes, minimizes artifacts, and synthesizes high-fidelity views compared to existing approaches.

Submitted 20 August, 2023; originally announced August 2023.

Comments: ICCV 2023, Project Page: https://ankitatiisc.github.io/Strata-NeRF/

arXiv:2308.06882 [pdf, other]

Quantifying Outlierness of Funds from their Categories using Supervised Similarity

Authors: Dhruv Desai, Ashmita Dhiman, Tushar Sharma, Deepika Sharma, Dhagash Mehta, Stefano Pasquali

Abstract: Mutual fund categorization has become a standard tool for the investment management industry and is extensively used by allocators for portfolio construction and manager selection, as well as by fund managers for peer analysis and competitive positioning. As a result, a (unintended) miscategorization or lack of precision can significantly impact allocation decisions and investment fund managers. Here, we aim to quantify the effect of miscategorization of funds utilizing a machine learning based approach. We formulate the problem of miscategorization of funds as a distance-based outlier detection problem, where the outliers are the data-points that are far from the rest of the data-points in the given feature space. We implement and employ a Random Forest (RF) based method of distance metric learning, and compute the so-called class-wise outlier measures for each data-point to identify outliers in the data. We test our implementation on various publicly available data sets, and then apply it to mutual fund data. We show that there is a strong relationship between the outlier measures of the funds and their future returns and discuss the implications of our findings.

Submitted 13 August, 2023; originally announced August 2023.

Comments: 8 pages, 5 tables, 8 figures

arXiv:2305.04967 [pdf, other]

UQ for Credit Risk Management: A deep evidence regression approach

Authors: Ashish Dhiman

Abstract: Machine Learning has invariantly found its way into various Credit Risk applications. Due to the intrinsic nature of Credit Risk, quantifying the uncertainty of the predicted risk metrics is essential, and applying uncertainty-aware deep learning models to credit risk settings can be very helpful. In this work, we have explored the application of a scalable UQ-aware deep learning technique, Deep Evidence Regression and applied it to predicting Loss Given Default. We contribute to the literature by extending the Deep Evidence Regression methodology to learning target variables generated by a Weibull process and provide the relevant learning framework. We demonstrate the application of our approach to both simulated and real-world data.

Submitted 17 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: 9 pages, plus references

arXiv:2207.09855 [pdf, other]

doi 10.1145/3503161.3547972

Everything is There in Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration

Authors: Rishubh Parihar, Ankit Dhiman, Tejan Karmali, R. Venkatesh Babu

Abstract: Unconstrained Image generation with high realism is now possible using recent Generative Adversarial Networks (GANs). However, it is quite challenging to generate images with a given set of attributes. Recent methods use style-based GAN models to perform image editing by leveraging the semantic hierarchy present in the layers of the generator. We present Few-shot Latent-based Attribute Manipulation and Editing (FLAME), a simple yet effective framework to perform highly controlled image editing by latent space manipulation. Specifically, we estimate linear directions in the latent space (of a pre-trained StyleGAN) that controls semantic attributes in the generated image. In contrast to previous methods that either rely on large-scale attribute labeled datasets or attribute classifiers, FLAME uses minimal supervision of a few curated image pairs to estimate disentangled edit directions. FLAME can perform both individual and sequential edits with high precision on a diverse set of images while preserving identity. Further, we propose a novel task of Attribute Style Manipulation to generate diverse styles for attributes such as eyeglass and hair. We first encode a set of synthetic images of the same identity but having different attribute styles in the latent space to estimate an attribute style manifold. Sampling a new latent from this manifold will result in a new attribute style in the generated image. We propose a novel sampling method to sample latent from the manifold, enabling us to generate a diverse set of attribute styles beyond the styles present in the training set. FLAME can generate diverse attribute styles in a disentangled manner. We illustrate the superior performance of FLAME against previous image editing methods by extensive qualitative and quantitative comparisons. FLAME also generalizes well on multiple datasets such as cars and churches.

Submitted 20 July, 2022; originally announced July 2022.

Comments: Project page: https://sites.google.com/view/flamelatentediting

arXiv:2204.05956 [pdf]

doi 10.1063/9.0000339

Magnetization statics and dynamics in (Ir/Co/Pt)$_6$ multilayers with Dzyaloshinskii-Moriya interaction

Authors: A. K. Dhiman, R. Gieniusz, P. Gruszecki, J. Kisielewski, M. Matczak, Z. Kurant, I. Sveklo, U. Guzowska, M. Tekielak, F. Stobiecki, A. Maziewski

Abstract: Magnetic multilayers of (Ir/Co/Pt)$_6$ with interfacial Dzyaloshinskii-Moriya interaction (IDMI) were deposited by magnetron sputtering with Co thickness $d=1.8$ nm. Exploiting magneto-optical Kerr effect in longitudinal mode microscopy, magnetic force microscopy, and vibrating sample magnetometry, the magnetic field-driven evolution of domain structures and magnetization hysteresis loops have been studied. The existence of weak stripe domains structure was deduced -- tens micrometers size domains with in-plane "core" magnetization modulated by hundred of nanometers domains with out-of-plane magnetization. Micromagnetic simulations interpreted such magnetization distribution. Quantitative evaluation of IDMI was carried out using Brillouin light scattering (BLS) spectroscopy as the difference between Stokes and anti-Stokes peak frequencies $Δf$. Due to the additive nature of IDMI, the asymmetric combination of Ir and Pt covers led to large values of effective IDMI energy density $D_\mathrm{eff}$. It was found that Stokes and anti-Stokes frequencies as well as $Δf$, measured as a function of in-plane applied magnetic field, show hysteresis. These results are explained under the consideration of the influence of IDMI on the dynamics of the in-plane magnetized "core" with weak stripe domains.

Submitted 12 April, 2022; originally announced April 2022.

Journal ref: AIP Advances 12, 045007 (2022)

arXiv:2007.06511 [pdf, other]

An Enhanced Text Classification to Explore Health based Indian Government Policy Tweets

Authors: Aarzoo Dhiman, Durga Toshniwal

Abstract: Government-sponsored policy-making and scheme generations is one of the means of protecting and promoting the social, economic, and personal development of the citizens. The evaluation of effectiveness of these schemes done by government only provide the statistical information in terms of facts and figures which do not include the in-depth knowledge of public perceptions, experiences and views on the topic. In this research work, we propose an improved text classification framework that classifies the Twitter data of different health-based government schemes. The proposed framework leverages the language representation models (LR models) BERT, ELMO, and USE. However, these LR models have less real-time applicability due to the scarcity of the ample annotated data. To handle this, we propose a novel GloVe word embeddings and class-specific sentiments based text augmentation approach (named Mod-EDA) which boosts the performance of text classification task by increasing the size of labeled data. Furthermore, the trained model is leveraged to identify the level of engagement of citizens towards these policies in different communities such as middle-income and low-income groups.

Submitted 18 August, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: Accepted to KDD 2020: Applied Data Science for Healthcare Workshop (4 pages, 2 figures, 2 tables)

ACM Class: I.2.7; H.2.8; I.5.4

arXiv:2004.11663 [pdf, other]

doi 10.1145/3408995

Retrofitting Parallelism onto OCaml

Authors: KC Sivaramakrishnan, Stephen Dolan, Leo White, Sadiq Jaffer, Tom Kelly, Anmol Sahoo, Sudha Parimala, Atul Dhiman, Anil Madhavapeddy

Abstract: OCaml is an industrial-strength, multi-paradigm programming language, widely used in industry and academia. OCaml is also one of the few modern managed system programming languages to lack support for shared memory parallel programming. This paper describes the design, a full-fledged implementation and evaluation of a mostly-concurrent garbage collector (GC) for the multicore extension of the OCaml programming language. Given that we propose to add parallelism to a widely used programming language with millions of lines of existing code, we face the challenge of maintaining backwards compatibility--not just in terms of the language features but also the performance of single-threaded code running with the new GC. To this end, the paper presents a series of novel techniques and demonstrates that the new GC strikes a balance between performance and feature backwards compatibility for sequential programs and scales admirably on modern multicore processors.

Submitted 2 July, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: Accepted to ICFP 2020

ACM Class: D.3.4

arXiv:1909.04916 [pdf, ps, other]

Method of variation of parameters revisited

Authors: Swarup Poria, Aman Dhiman

Abstract: The method of variation of parameters (VOP) for solving linear ordinary differential equations is revisited in this article. Historically, Lagrange and Euler explained the method of variation of parameters in the context of the perturbation method. In this article, we explain the construction of particular solutions of a linear ordinary differential equation in the light of linearly independent functions in a more systematic way. In addition, we have shown that even if the time variation of the constants contributes substantially to the velocity, the solution remains invariant. The VOP method for a system of n linear ODEs is discussed. Duhamel's principle has also been studied in reference to a system of n linear ODEs for completeness of this review. Finally, the application of the VOP method to constructing Green's functions is reported.

Submitted 11 September, 2019; originally announced September 2019.

Comments: 12 pages, 1 figure
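
Note: for readers unfamiliar with the method named in the abstract above, the textbook second-order case can be sketched as follows. This is the standard formulation, not a reproduction of the paper's derivation; the symbols $p$, $q$, $g$, $y_1$, $y_2$, $u_1$, $u_2$ and $W$ are notation assumed here for illustration.

```latex
% Linear second-order ODE with known homogeneous solutions y_1, y_2:
%   y'' + p(x) y' + q(x) y = g(x)
% Ansatz: y_p = u_1(x) y_1(x) + u_2(x) y_2(x),
% with the customary side condition u_1' y_1 + u_2' y_2 = 0.
\[
  W(x) = y_1 y_2' - y_1' y_2 , \qquad
  u_1' = -\frac{y_2\, g}{W} , \qquad
  u_2' = \frac{y_1\, g}{W} ,
\]
\[
  y_p(x) = -\,y_1(x) \int \frac{y_2(x)\, g(x)}{W(x)}\, dx
           \;+\; y_2(x) \int \frac{y_1(x)\, g(x)}{W(x)}\, dx .
\]
```

As a quick check, for $y'' + y = g(x)$ with $y_1 = \cos x$, $y_2 = \sin x$, the Wronskian is $W = 1$ and the formula reduces to $y_p(x) = \int_0^x \sin(x - s)\, g(s)\, ds$, which is the Green's function form of the solution.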

arXiv:1605.05317 [pdf]

Existence and uniqueness theorem for ODE: an overview

Authors: Swarup Poria, Aman Dhiman

Abstract: The study of existence and uniqueness of solutions became important due to the lack of general formula for solving nonlinear ordinary differential equations (ODEs). Compact form of existence and uniqueness theory appeared nearly 200 years after the development of the theory of differential equation. In the article, we shall discuss briefly the differences between linear and nonlinear first order ODE in context of existence and uniqueness of solutions. Special emphasis is given on the Lipschitz continuous functions in the discussion.

Submitted 17 May, 2016; originally announced May 2016.

Comments: 10 pages

arXiv:1411.6034 [pdf]

A Nano-satellite Mission to Study Charged Particle Precipitation from the Van Allen Radiation Belts caused due to Seismo-Electromagnetic Emissions

Authors: Nithin Sivadas, Akshay Gulati, Deepti Kannapan, Ananth Saran Yalamarthy, Ankit Dhiman, Arjun Bhagoji, Athreya Shankar, Nitin Prasad, Harishankar Ramachandran, R. David Koilpillai

Abstract: In the past decade, several attempts have been made to study the effects of seismo-electromagnetic emissions - an earthquake precursor, on the ionosphere and the radiation belts. The IIT Madras nano-satellite (IITMSAT) mission is designed to make sensitive measurements of charged particle fluxes in a Low Earth Orbit to study the nature of charged particle precipitation from the Van Allen radiation belts caused due to such emissions. With the Space-based Proton Electron Energy Detector on-board a single nano-satellite, the mission will attempt to gather statistically significant data to verify possible correlations with seismo-electromagnetic emissions before major earthquakes.

Submitted 21 November, 2014; originally announced November 2014.

Comments: 6 pages, 3 figures, Submitted to and accepted at The 5th Nano-Satellite Symposium

Showing 1–15 of 15 results for author: Dhiman, A