-
Structural Causality-based Generalizable Concept Discovery Models
Authors:
Sanchit Sinha,
Guangzhi Xiong,
Aidong Zhang
Abstract:
The rising need for explainable deep neural network architectures has led to the use of semantic concepts as explainable units. Several approaches utilizing disentangled representation learning estimate the generative factors and utilize them as concepts for explaining DNNs. However, even though the generative factors for a dataset remain fixed, concepts are not fixed entities and vary based on downstream tasks. In this paper, we propose a disentanglement mechanism utilizing a variational autoencoder (VAE) for learning mutually independent generative factors for a given dataset and subsequently learning task-specific concepts using a structural causal model (SCM). Our method assumes generative factors and concepts to form a bipartite graph, with directed causal edges from generative factors to concepts. Experiments are conducted on datasets with known generative factors: D-sprites and Shapes3D. On specific downstream tasks, our proposed method successfully learns task-specific concepts which are explained well by the causal edges from the generative factors. Lastly, unlike current causal concept discovery methods, our methodology is generalizable to an arbitrary number of concepts and flexible to any downstream task.
Submitted 20 October, 2024;
originally announced October 2024.
-
Plasma-Metal Junction: A Junction With Negative Turn-On Voltage
Authors:
Sneha Latha Kommuguri,
Smrutishree Pratihary,
Thangjam Rishikanta Singh,
Suraj Kumar Sinha
Abstract:
Unlike junctions in solid-state devices, a plasma-metal junction (pm-junction) is a junction of classical and quantum electrons. The plasma electrons are Maxwellian in nature, while metal electrons obey the Fermi-Dirac distribution. In this experiment, the current-voltage characteristics of solid-state devices that form homo- or hetero-junctions are compared to those of the pm-junction. Observation shows that the turn-on voltage for the pn-junction is 0.5V and decreases to 0.24V for the metal-semiconductor junction. However, the pm-junction's turn-on voltage was lowered to a negative value of -7.0V. Devices with negative turn-on voltage are suitable for high-frequency operations. Further, observations show that the current-voltage characteristics of the pm-junction depend on the metal's work function, while the turn-on voltage remains unchanged. This result validates the applicability of the energy-band model for the pm-junction. We present a perspective metal-oxide-plasma (MOP), a gaseous electronic device, as an alternative to metal-oxide-semiconductor (MOS), based on the new understanding developed.
Submitted 18 October, 2024;
originally announced October 2024.
-
Label-free prediction of fluorescence markers in bovine satellite cells using deep learning
Authors:
Sania Sinha,
Aarham Wasit,
Won Seob Kim,
Jongkyoo Kim,
Jiyoon Yi
Abstract:
Assessing the quality of bovine satellite cells (BSCs) is essential for the cultivated meat industry, which aims to address global food sustainability challenges. This study aims to develop a label-free method for predicting fluorescence markers in isolated BSCs using deep learning. We employed a U-Net-based CNN model to predict multiple fluorescence signals from a single bright-field microscopy image of cell culture. Two key biomarkers, DAPI and Pax7, were used to determine the abundance and quality of BSCs. The image pre-processing pipeline included fluorescence denoising to improve prediction performance and consistency. A total of 48 biological replicates were used, with statistical performance metrics such as Pearson correlation coefficient and SSIM employed for model evaluation. The model exhibited better performance with DAPI predictions due to uniform staining. Pax7 predictions were more variable, reflecting biological heterogeneity. Enhanced visualization techniques, including color mapping and image overlay, improved the interpretability of the predictions by providing better contextual and perceptual information. The findings highlight the importance of data pre-processing and demonstrate the potential of deep learning to advance non-invasive, label-free assessment techniques in the cultivated meat industry, paving the way for reliable and actionable AI-driven evaluations.
Submitted 17 October, 2024;
originally announced October 2024.
-
On a relative dependency formula
Authors:
Shashi Ranjan Sinha
Abstract:
Celikbas, Liang and Sadeghi established a one-sided inequality for the relative version of Jorgensen's dependency formula and asked whether it would be an equality. In this paper, we show that the inequality can indeed be strict, and prove a relative dependency formula. Along the way, we obtain some bounds on s(M,N), a notion related to the vanishing of relative homology of finitely generated modules M and N over a local ring R, under specific assumptions.
Submitted 17 October, 2024;
originally announced October 2024.
-
Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights
Authors:
Rahul Krishna,
Rangeet Pan,
Raju Pavuluri,
Srikanth Tamilselvam,
Maja Vukovic,
Saurabh Sinha
Abstract:
Large Language Models for Code (or code LLMs) are increasingly gaining popularity and capabilities, offering a wide array of functionalities such as code completion, code generation, code summarization, test generation, code translation, and more. To leverage code LLMs to their full potential, developers must provide code-specific contextual information to the models. This information is typically derived and distilled using program analysis tools. However, there exists a significant gap: these static analysis tools are often language-specific and come with a steep learning curve, making their effective use challenging. These tools are tailored to specific programming languages, requiring developers to learn and manage multiple tools to cover various aspects of their code base. Moreover, the complexity of configuring and integrating these tools into existing development environments adds an additional layer of difficulty. This challenge limits the potential benefits that could be gained from more widespread and effective use of static analysis in conjunction with LLMs.
To address this challenge, we present codellm-devkit (hereafter, `CLDK'), an open-source library that significantly simplifies the process of performing program analysis at various levels of granularity for different programming languages to support code LLM use cases. As a Python library, CLDK offers developers an intuitive and user-friendly interface, making it incredibly easy to provide rich program analysis context to code LLMs. With this library, developers can effortlessly integrate detailed, code-specific insights that enhance the operational efficiency and effectiveness of LLMs in coding tasks. CLDK is available as an open-source library at https://github.com/IBM/codellm-devkit.
Submitted 16 October, 2024;
originally announced October 2024.
-
Hamiltonian bridge: A physics-driven generative framework for targeted pattern control
Authors:
Vishaal Krishnan,
Sumit Sinha,
L. Mahadevan
Abstract:
Patterns arise spontaneously in a range of systems spanning the sciences, and their study typically focuses on mechanisms to understand their evolution in space-time. Increasingly, there has been a transition towards controlling these patterns in various functional settings, with implications for engineering. Here, we combine our knowledge of a general class of dynamical laws for pattern formation in non-equilibrium systems, and the power of stochastic optimal control approaches to present a framework that allows us to control patterns at multiple scales, which we dub the "Hamiltonian bridge". We use a mapping between stochastic many-body Lagrangian physics and deterministic Eulerian pattern-forming PDEs to leverage our recent approach utilizing the Feynman-Kac-based adjoint path integral formulation for the control of interacting particles and generalize this to the active control of patterning fields. We demonstrate the applicability of our computational framework via numerical experiments on the control of phase separation with and without a conserved order parameter, self-assembly of fluid droplets, coupled reaction-diffusion equations and finally a phenomenological model for spatio-temporal tissue differentiation. We interpret our numerical experiments in terms of a theoretical understanding of how the underlying physics shapes the geometry of the pattern manifold, altering the transport paths of patterns and the nature of pattern interpolation. We conclude by showing how optimal control can be utilized to generate complex patterns via an iterative control protocol over pattern-forming PDEs that can be cast as gradient flows. Altogether, our study shows how we can systematically build physical priors into a generative framework for pattern control in non-equilibrium systems across multiple length and time scales.
Submitted 16 October, 2024;
originally announced October 2024.
-
REPeat: A Real2Sim2Real Approach for Pre-acquisition of Soft Food Items in Robot-assisted Feeding
Authors:
Nayoung Ha,
Ruolin Ye,
Ziang Liu,
Shubhangi Sinha,
Tapomayukh Bhattacharjee
Abstract:
The paper presents REPeat, a Real2Sim2Real framework designed to enhance bite acquisition in robot-assisted feeding for soft foods. It uses 'pre-acquisition actions' such as pushing, cutting, and flipping to improve the success rate of bite acquisition actions such as skewering, scooping, and twirling. If the data-driven model predicts low success for direct bite acquisition, the system initiates a Real2Sim phase, reconstructing the food's geometry in a simulation. The robot explores various pre-acquisition actions in the simulation, then a Sim2Real step renders a photorealistic image to reassess success rates. If the success improves, the robot applies the action in reality. We evaluate the system on 15 diverse plates with 10 types of food items for a soft food diet, showing improvement in bite acquisition success rates by 27% on average across all plates. See our project website at https://emprise.cs.cornell.edu/repeat.
Submitted 13 October, 2024;
originally announced October 2024.
-
Suitability Analysis of Ground Motion Prediction Equations for Western and Central Himalayas and Indo-Gangetic Plains
Authors:
S. Selvan,
Suman Sinha
Abstract:
Ground motion prediction equations (GMPEs) play a key role in seismic hazard assessment (SHA). Considering the seismo-tectonic, geophysical and geotectonic characteristics of a target region, not all GMPEs may be suitable for predicting the observed ground motion effectively. With a fairly large number of published GMPEs, the selection and ranking of suitable GMPEs for the design of logic trees in SHA for a particular target region has become a necessity of late. This paper presents a detailed quantitative evaluation of the performance of 16 GMPEs against recorded ground motion data in two target regions, characterized by distinct seismo-tectonic, geophysical and geotectonic nature. The data set comprises 465 three-component spectral accelerograms corresponding to 122 earthquake events. The suitability of a GMPE is tested by two widely accepted data-driven statistical methods, namely, the likelihood (LH) and log-likelihood (LLH) methods. Different suites of GMPEs are shown to be suitable for different periods of interest. The results will be useful to scientists and engineers for microzonation and estimation of seismic design parameters for the design of earthquake-resistant structures in these regions.
Submitted 8 October, 2024;
originally announced October 2024.
-
ProtoNAM: Prototypical Neural Additive Models for Interpretable Deep Tabular Learning
Authors:
Guangzhi Xiong,
Sanchit Sinha,
Aidong Zhang
Abstract:
Generalized additive models (GAMs) have long been a powerful white-box tool for the intelligible analysis of tabular data, revealing the influence of each feature on the model predictions. Despite the success of neural networks (NNs) in various domains, their application as NN-based GAMs in tabular data analysis remains suboptimal compared to tree-based ones, and the opacity of encoders in NN-GAMs also prevents users from understanding how networks learn the functions. In this work, we propose a new deep tabular learning method, termed Prototypical Neural Additive Model (ProtoNAM), which introduces prototypes into neural networks in the framework of GAMs. With the introduced prototype-based feature activation, ProtoNAM can flexibly model the irregular mapping from tabular features to the outputs while maintaining the explainability of the final prediction. We also propose a gradient-boosting inspired hierarchical shape function modeling method, facilitating the discovery of complex feature patterns and bringing transparency into the learning process of each network layer. Our empirical evaluations demonstrate that ProtoNAM outperforms all existing NN-based GAMs, while providing additional insights into the shape function learned for each feature. The source code of ProtoNAM is available at https://github.com/Teddy-XiongGZ/ProtoNAM.
Submitted 6 October, 2024;
originally announced October 2024.
-
Compound V3 Economic Audit Report
Authors:
Rik Ghosh,
Samrat Gupta,
Arka Datta,
Abhimanyu Nag,
Sudipan Sinha
Abstract:
Compound Finance is a decentralized lending protocol that enables the secure and efficient borrowing and lending of cryptocurrencies, utilizing smart contracts and dynamic interest rates based on supply and demand to facilitate transactions. The protocol enables users to supply different crypto assets and accrue interest, while borrowers can avail themselves of loans secured by collateralized assets. Our collaboration with Compound Finance focuses on harnessing the power of the Chainrisk simulation engine to optimize risk parameters of the Compound V3 (Comet) protocol. This report delineates a comprehensive methodology aimed at calculating key risk metrics of the protocol. This optimization framework is pivotal for mitigating systemic risks and enhancing the overall stability of the protocol. By leveraging Chainrisk's Cloud Platform, we conduct millions of simulations to evaluate the protocol's Value at Risk (VaR) and Liquidations at Risk (LaR), ultimately providing recommendations for parameter adjustments.
Submitted 5 October, 2024;
originally announced October 2024.
-
Novel electronic state of honeycomb iridate Cu$_2$IrO$_3$ at high pressure
Authors:
G. Fabbris,
E. H. T. Poldi,
S. Sinha,
J. Lim,
T. Elmslie,
J. H. Kim,
A. Said,
M. Upton,
M. Abramchuk,
F. Bahrami,
C. Kenney-Benson,
C. Park,
G. Shen,
Y. K. Vohra,
R. J. Hemley,
J. J. Hamlin,
F. Tafti,
D. Haskel
Abstract:
Cu$_2$IrO$_3$ has attracted recent interest due to its proximity to the Kitaev quantum spin liquid state and the complex structural response observed at high pressures. We use x-ray spectroscopy and scattering as well as electrical transport techniques to unveil the electronic structure of Cu$_2$IrO$_3$ at ambient and high pressures. Despite featuring a $\mathrm{Ir^{4+}}$ $J_{\rm{eff}}=1/2$ state at ambient pressure, Ir $L_{3}$ edge resonant inelastic x-ray scattering reveals broadened electronic excitations that point to the importance of Ir $5d$-Cu $3d$ interaction. High pressure first drives an Ir-Ir dimer state with collapsed $\langle \mathbf{L} \cdot \mathbf{S} \rangle$ and $\langle L_z \rangle/\langle S_z \rangle$, signaling the formation of $5d$ molecular orbitals. A novel $\mathrm{Cu \to Ir}$ charge transfer is observed at the onset of phase 5 above 30 GPa at low temperatures, leading to an approximate $\mathrm{Ir^{3+}}$ and $\mathrm{Cu^{1.5+}}$ valence, with persistent insulating electrical transport seemingly driven by charge segregation of Cu 1+/2+ ions into distinct sites. Concomitant x-ray spectroscopy and scattering measurements through different thermodynamic paths demonstrate a strong electron-lattice coupling, with $J_{\rm{eff}}=1/2$ and $\mathrm{Ir^{3+}}$/$\mathrm{Cu^{1.5+}}$ electronic states occurring only in phases 1 and 5, respectively. Remarkably, the charge-transferred state can only be reached if Cu$_2$IrO$_3$ is pressurized at low temperature, suggesting that phonons play an important role in the stability of this phase. These results point to the choice of thermodynamic path across interplanar collapse transition as a key route to access novel states in intercalated iridates.
Submitted 3 October, 2024;
originally announced October 2024.
-
Text2CAD: Generating Sequential CAD Models from Beginner-to-Expert Level Text Prompts
Authors:
Mohammad Sadil Khan,
Sankalp Sinha,
Talha Uddin Sheikh,
Didier Stricker,
Sk Aziz Ali,
Muhammad Zeshan Afzal
Abstract:
Prototyping complex computer-aided design (CAD) models in modern software can be very time-consuming. This is due to the lack of intelligent systems that can quickly generate simpler intermediate parts. We propose Text2CAD, the first AI framework for generating text-to-parametric CAD models using designer-friendly instructions for all skill levels. Furthermore, we introduce a data annotation pipeline for generating text prompts based on natural language instructions for the DeepCAD dataset using Mistral and LLaVA-NeXT. The dataset contains $\sim170$K models and $\sim660$K text annotations, from abstract CAD descriptions (e.g., generate two concentric cylinders) to detailed specifications (e.g., draw two circles with center $(x,y)$ and radius $r_{1}$, $r_{2}$, and extrude along the normal by $d$...). Within the Text2CAD framework, we propose an end-to-end transformer-based auto-regressive network to generate parametric CAD models from input texts. We evaluate the performance of our model through a mixture of metrics, including visual quality, parametric precision, and geometrical accuracy. Our proposed framework shows great potential in AI-aided design applications. Our source code and annotations will be publicly available.
Submitted 25 September, 2024;
originally announced September 2024.
-
Deconstructing Solar Super Active Region 13664 in the Context of the Historic Geomagnetic Storm of 2024 May 10-11
Authors:
Priyansh Jaswal,
Suvadip Sinha,
Dibyendu Nandy
Abstract:
The impact of solar-stellar activity on planetary environments is a topic of great interest within the Sun-Earth system as well as exoplanetary systems. In particular, extreme events such as flares and coronal mass ejections have a profound effect on planetary atmospheres. In May this year, a magnetic active region on the Sun (AR 13664) -- with a size exceeding a hundred times that of Earth -- unleashed a large number of high energy X-class flares and associated mass ejections. The resulting Earth impact (geomagnetic storm) on May 10-11 was the strongest in the last two decades. We perform the first comprehensive analysis of the magnetic properties of the active region that spawned these flares and identify this to be a super active region with very rare physical characteristics. We also demonstrate how the rate of energization of the system is related to the flaring process. Our work illuminates how flare productive super active regions on the Sun and stars can be identified and what their salient physical properties are. Specifically, we put AR 13664 in historical context over the cumulative period of 1874 May-2024 June. We find that AR 13664 stands at the 99.95 percentile in the distribution of area over 1874 May-2024 June, and at the 99.10 percentile in terms of flux content among all ARs over the period 1996 April-2024 June. Our analysis indicates that five of its magnetic properties rank highest among all ARs recorded in the SHARP data series during 2010 May-2024 June by the Solar Dynamics Observatory. Furthermore, we demonstrate that AR 13664 reached its most dynamic flare productive state following a rapid rate of rise of its flare-relevant parameters and that the X-class flares it spawned were more frequent near their peak values. Our analyses establish AR 13664 to be a solar super active region and provide a paradigm for investigating their flare-relevant physical characteristics.
Submitted 23 September, 2024;
originally announced September 2024.
-
On the Effectiveness of Neural Operators at Zero-Shot Weather Downscaling
Authors:
Saumya Sinha,
Brandon Benton,
Patrick Emami
Abstract:
Machine learning (ML) methods have shown great potential for weather downscaling. These data-driven approaches provide a more efficient alternative for producing high-resolution weather datasets and forecasts compared to physics-based numerical simulations. Neural operators, which learn solution operators for a family of partial differential equations (PDEs), have shown great success in scientific ML applications involving physics-driven datasets. Neural operators are grid-resolution-invariant and are often evaluated on higher grid resolutions than they are trained on, i.e., zero-shot super-resolution. Given their promising zero-shot super-resolution performance on dynamical systems emulation, we present a critical investigation of their zero-shot weather downscaling capabilities, which is when models are tasked with producing high-resolution outputs using higher upsampling factors than are seen during training. To this end, we create two realistic downscaling experiments with challenging upsampling factors (e.g., 8x and 15x) across data from different simulations: the European Centre for Medium-Range Weather Forecasts Reanalysis version 5 (ERA5) and the Wind Integration National Dataset Toolkit (WTK). While neural operator-based downscaling models perform better than interpolation and a simple convolutional baseline, we show the surprising performance of an approach that combines a powerful transformer-based model with parameter-free interpolation at zero-shot weather downscaling. We find that this Swin-Transformer-based approach mostly outperforms models with neural operator layers, and suggest its use in future work as a strong baseline.
Submitted 20 September, 2024;
originally announced September 2024.
-
Computing the $\mathbb{Z}_2$ Invariant in Two-Dimensional Strongly-Correlated Systems
Authors:
Sounak Sinha,
Barry Bradlyn
Abstract:
We show that the two-dimensional $\mathbb{Z}_2$ invariant for time-reversal invariant insulators can be formulated in terms of the boundary-condition dependence of the ground state wavefunction for both non-interacting and strongly-correlated insulators. By introducing a family of quasi-single particle states associated to the many-body ground state of an insulator, we show that the $\mathbb{Z}_2$ invariant can be expressed as the integral of a certain Berry connection over half the space of boundary conditions, providing an alternative expression to the formulations that appear in [Lee et al., Phys. Rev. Lett. $\textbf{100}$, 186807 (2008)]. We show the equivalence of the different many-body formulations of the invariant, and show how they reduce to known band-theoretic results for Slater determinant ground states. Finally, we apply our results to analytically calculate the invariant for the Kane-Mele model with nonlocal (orbital) Hatsugai-Kohmoto (HK) interactions. This rigorously establishes the topological nontriviality of the Kane-Mele model with HK interactions, and represents one of the few exact calculations of the $\mathbb{Z}_2$ invariant for a strongly-interacting system.
Submitted 18 September, 2024;
originally announced September 2024.
-
Grid-Forming Storage Networks: Analytical Characterization of Damping and Design Insights
Authors:
Kaustav Chatterjee,
Ramij Raja Hossain,
Sai Pushpak Nandanoori,
Soumya Kundu,
Subhrajit Sinha,
Diane Baldwin,
Ronald Melton
Abstract:
The paper presents a theoretical study on small-signal stability and damping in bulk power systems with multiple grid-forming inverter-based storage resources. A detailed analysis is presented, characterizing the impacts of inverter droop gains and storage size on the slower eigenvalues, particularly those concerning inter-area oscillation modes. From these parametric sensitivity studies, a set of necessary conditions is derived that the design of droop gain must satisfy to enhance damping performance. The analytical findings are structured into propositions highlighting potential design considerations for improving system stability. The findings are illustrated via numerical studies on an IEEE 68-bus grid-forming storage network.
Submitted 5 September, 2024;
originally announced September 2024.
-
Multi-language Unit Test Generation using LLMs
Authors:
Rangeet Pan,
Myeongsoo Kim,
Rahul Krishna,
Raju Pavuluri,
Saurabh Sinha
Abstract:
Implementing automated unit tests is an important but time-consuming activity in software development. Developers dedicate substantial time to writing tests for validating an application and preventing regressions. To support developers in this task, software engineering research over the past few decades has developed many techniques for automating unit test generation. However, despite this effort, usable tools exist for very few programming languages -- mainly Java, C, and C# and, more recently, for Python. Moreover, studies have found that automatically generated tests suffer from poor readability and often do not resemble developer-written tests. In this work, we present a rigorous investigation of how large language models (LLMs) can help bridge the gap. We describe a generic pipeline that incorporates static analysis to guide LLMs in generating compilable and high-coverage test cases. We illustrate how the pipeline can be applied to different programming languages, specifically Java and Python, and to complex software requiring environment mocking. We conducted a thorough empirical study to assess the quality of the generated tests in terms of coverage, mutation score, and test naturalness -- evaluating them on standard as well as enterprise Java applications and a large Python benchmark. Our results demonstrate that LLM-based test generation, when guided by static analysis, can be competitive with, and even outperform, state-of-the-art test-generation techniques in coverage achieved while also producing considerably more natural test cases that developers find easy to read and understand. We also present the results of a user study, conducted with 161 professional developers, that highlights the naturalness characteristics of the tests generated by our approach.
Submitted 4 September, 2024;
originally announced September 2024.
-
Superconductivity in pressurized Re$_{0.10}$Mo$_{0.90}$B$_2$
Authors:
S. Sinha,
J. Lim,
Z. Li,
J. S. Kim,
A. C. Hire,
P. M. Dee,
R. S. Kumar,
D. Popov,
R. J. Hemley,
R. G. Hennig,
P. J. Hirschfeld,
G. R. Stewart,
J. J. Hamlin
Abstract:
The recent surprising discovery of superconductivity with critical temperature $T_c$ = 32 K in MoB$_2$ above 70 GPa has led to the search for related materials that may superconduct at similarly high $T_c$ values and lower pressures. We have studied the superconducting and structural properties of Re$_{0.10}$Mo$_{0.90}$B$_2$ to 170 GPa. A structural phase transition from R3m to P6/mmm commences at 48 GPa, with the first signatures of superconductivity appearing above 44 GPa. The critical temperature is observed to increase with pressure. A complete resistive transition is observed only above 150 GPa, where the highest onset $T_c$ of 30 K is also achieved. Upon releasing pressure, the high pressure superconducting phase is found to be metastable. During unloading, a complete resistive superconducting transition is observed all the way down to 20 GPa (with onset $T_c \sim 20$ K). Our results suggest that the P6/mmm structure is responsible for the observed superconductivity.
Submitted 30 August, 2024;
originally announced August 2024.
-
Mechanics promotes coherence in heterogeneous active media
Authors:
Soling Zimik,
Sitabhra Sinha
Abstract:
Synchronization of activity among myocytes constituting vital organs, e.g., the heart, is crucial for physiological functions. Self-organized coordination in such a heterogeneous ensemble of excitable and oscillatory cells is therefore of clinical importance. We show that by varying the strength of intercellular coupling and the electrophysiological diversity, a wide range of collective behavior emerges, including clusters of synchronized activity. Strikingly, stretch-activated currents allow waves of mechanical deformation to alter the activity of neighboring cells, promoting robust global coherence.
Submitted 20 August, 2024;
originally announced August 2024.
-
SpectralGaussians: Semantic, spectral 3D Gaussian splatting for multi-spectral scene representation, visualization and analysis
Authors:
Saptarshi Neil Sinha,
Holger Graf,
Michael Weinmann
Abstract:
We propose a novel cross-spectral rendering framework based on 3D Gaussian Splatting (3DGS) that generates realistic and semantically meaningful splats from registered multi-view spectrum and segmentation maps. This extension enhances the representation of scenes with multiple spectra, providing insights into the underlying materials and segmentation. We introduce an improved physically-based rendering approach for Gaussian splats, estimating reflectance and lights per spectra, thereby enhancing accuracy and realism. In a comprehensive quantitative and qualitative evaluation, we demonstrate the superior performance of our approach with respect to other recent learning-based spectral scene representation approaches (i.e., XNeRF and SpectralNeRF) as well as other non-spectral state-of-the-art learning-based approaches. Our work also demonstrates the potential of spectral scene understanding for precise scene editing techniques like style transfer, inpainting, and removal. Thereby, our contributions address challenges in multi-spectral scene representation, rendering, and editing, offering new possibilities for diverse applications.
Submitted 13 August, 2024;
originally announced August 2024.
-
Effect of low-temperature compression on superconductivity and crystal structure in strontium metal
Authors:
J. Lim,
S. Sinha,
D. E. Jackson,
R. S. Kumar,
C. Park,
R. J. Hemley,
D. VanGennep,
Y. K. Vohra,
R. G. Hennig,
P. J. Hirschfeld,
G. R. Stewart,
J. J. Hamlin
Abstract:
The superconducting and structural properties of elemental strontium metal were investigated under pressures up to 60 GPa while maintaining cryogenic conditions during pressure application. Applying pressure at low temperatures reveals differences in superconducting and structural phases compared to previous reports obtained at room temperatures. Notably, the superconducting critical temperature exhibits a twofold increase under compression after cryogenic cooling within the pressure range of 35-42 GPa, compared to cryogenic cooling after room-temperature compression. Subsequently, the transition width becomes significantly sharper above 42 GPa. Low-temperature X-ray diffraction measurements under pressure reveal that this change corresponds to the Sr-III to Sr-IV transition, with no evidence of any metastable structure. Furthermore, the monoclinic Sr-IV structure was observed to remain stable to much higher pressures - at least up to 60 GPa, without the appearance of the incommensurate Sr-V phase present at room temperature. This implies that thermal activation energy plays an important role in overcoming the presence of a kinetic barrier to the Sr-V phase at room temperature.
Submitted 12 August, 2024;
originally announced August 2024.
-
CoLiDR: Concept Learning using Aggregated Disentangled Representations
Authors:
Sanchit Sinha,
Guangzhi Xiong,
Aidong Zhang
Abstract:
Interpretability of Deep Neural Networks using concept-based models offers a promising way to explain model behavior through human-understandable concepts. A parallel line of research focuses on disentangling the data distribution into its underlying generative factors, in turn explaining the data generation process. While both directions have received extensive attention, little work has been done on explaining concepts in terms of generative factors to unify mathematically disentangled representations and human-understandable concepts as an explanation for downstream tasks. In this paper, we propose a novel method CoLiDR - which utilizes a disentangled representation learning setup for learning mutually independent generative factors and subsequently learns to aggregate the said representations into human-understandable concepts using a novel aggregation/decomposition module. Experiments are conducted on datasets with both known and unknown latent generative factors. Our method successfully aggregates disentangled generative factors into concepts while maintaining parity with state-of-the-art concept-based approaches. Quantitative and visual analysis of the learned aggregation procedure demonstrates the advantages of our work compared to commonly used concept-based models over four challenging datasets. Lastly, our work is generalizable to an arbitrary number of concepts and generative factors - making it flexible enough to be suitable for various types of data.
Submitted 27 July, 2024;
originally announced July 2024.
-
Investigating Metal Dopants for Lowering the Contact Resistance of Top Gold Contacted Monolayer MoS2
Authors:
Saurabh Kharwar,
Soham Sinha,
Tarun Kumar Agarwal
Abstract:
The interface properties between gold (Au) contacts and molybdenum disulfide (MoS2) are critical for optimizing the performance of semiconductor devices. This study investigates the impact of metal dopants (D) on the transport properties of MoS2 devices with top Au contacts, aiming to reduce contact resistance and enhance device performance. Using density functional theory (DFT) and non-equilibrium Green's function (NEGF)-based first-principles calculations, we examine the structural, electronic, and quantum transport properties of Au-contacted, metal-doped MoS2. Our results indicate that Cd, Re, and Ru dopants significantly improve the structural stability and electronic properties of MoS2. Specifically, formation energy calculations show that Cd and Re are stable at hollow sites, while Ru prefers bond sites. Remarkably, the Au-Ru-MoS2-based device exhibits tunnel resistance (RT) up to 4.82 ohm-um. Furthermore, a dual-gated Au-Ru-MoS2 field effect transistor (FET) demonstrates an impressive Ion/Ioff ratio of 10^8 at Vgs of 2 V, highlighting its potential for nano-switching applications.
Submitted 23 July, 2024; v1 submitted 21 July, 2024;
originally announced July 2024.
-
Dissipative chaos and steady state of open Tavis-Cummings dimer
Authors:
Debabrata Mondal,
Andrey Kolovsky,
S. Sinha
Abstract:
We consider a coupled atom-photon system described by the Tavis-Cummings dimer (two coupled cavities) in the presence of photon loss and atomic pumping, to investigate the quantum signature of dissipative chaos. The appropriate classical limit of the model allows us to obtain a phase diagram identifying different dynamical phases, especially the onset of chaos. Both classically and quantum mechanically, we demonstrate the emergence of a steady state in the chaotic regime and analyze its properties. The interplay between quantum fluctuation and chaos leads to enhanced mixing dynamics and dephasing, resulting in the formation of an incoherent photonic fluid. The steady state exhibits an intriguing phenomenon of subsystem thermalization even outside the chaotic regime; however, its effective temperature increases with the degree of chaos. Moreover, the statistical properties of the steady state show a close connection with the random matrix theory. Finally, we discuss the experimental relevance of our findings, which can be tested in cavity and circuit quantum electrodynamics setups.
Submitted 30 June, 2024;
originally announced July 2024.
-
Situational Instructions Database: Task Guidance in Dynamic Environments
Authors:
Muhammad Saif Ullah Khan,
Sankalp Sinha,
Didier Stricker,
Muhammad Zeshan Afzal
Abstract:
The Situational Instructions Database (SID) addresses the need for enhanced situational awareness in artificial intelligence (AI) systems operating in dynamic environments. By integrating detailed scene graphs with dynamically generated, task-specific instructions, SID provides a novel dataset that allows AI systems to perform complex, real-world tasks with improved context sensitivity and operational accuracy. This dataset leverages advanced generative models to simulate a variety of realistic scenarios based on the 3D Semantic Scene Graphs (3DSSG) dataset, enriching it with scenario-specific information that details environmental interactions and tasks. SID facilitates the development of AI applications that can adapt to new and evolving conditions without extensive retraining, supporting research in autonomous technology and AI-driven decision-making processes. This dataset is instrumental in developing robust, context-aware AI agents capable of effectively navigating and responding to unpredictable settings. Available for research and development, SID serves as a critical resource for advancing the capabilities of intelligent systems in complex environments. Dataset available at \url{https://github.com/mindgarage/situational-instructions-database}.
Submitted 19 June, 2024;
originally announced June 2024.
-
Quantum $K$-invariants via Quot schemes I
Authors:
Shubham Sinha,
Ming Zhang
Abstract:
We study the virtual Euler characteristics of sheaves over Quot schemes of curves, establishing that these invariants fit into a topological quantum field theory (TQFT) valued in $\mathbb{Z}[[q]]$. Utilizing Quot scheme compactifications alongside the TQFT framework, we derive presentations of the small quantum $K$-ring of the Grassmannian. Our approach offers a new method for finding explicit formulas for quantum $K$-invariants.
Submitted 17 June, 2024;
originally announced June 2024.
-
GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges
Authors:
Darshan Deshpande,
Shambhavi Sinha,
Anirudh Ravi Kumar,
Debaditya Pal,
Jonathan May
Abstract:
Language Models have previously shown strong negotiation capabilities in closed domains where the negotiation strategy prediction scope is constrained to a specific setup. In this paper, we first show that these models are not generalizable beyond their original training domain despite their wide-scale pretraining. Following this, we propose an automated framework called GNOME, which processes existing human-annotated, closed-domain datasets using Large Language Models and produces synthetic open-domain dialogues for negotiation. GNOME improves the generalizability of negotiation systems while reducing the expensive and subjective task of manual data curation. Through our experimental setup, we create a benchmark comparing encoder and decoder models trained on existing datasets against datasets created through GNOME. Our results show that models trained on our dataset not only perform better than previous state of the art models on domain specific strategy prediction, but also generalize better to previously unseen domains.
Submitted 15 June, 2024;
originally announced June 2024.
-
QCQA: Quality and Capacity-aware grouped Query Attention
Authors:
Vinay Joshi,
Prashant Laddha,
Shambhavi Sinha,
Om Ji Omer,
Sreenivas Subramoney
Abstract:
Excessive memory requirements of key and value features (KV-cache) present significant challenges in the autoregressive inference of large language models (LLMs), restricting both the speed and length of text generation. Approaches such as Multi-Query Attention (MQA) and Grouped Query Attention (GQA) mitigate these challenges by grouping query heads and consequently reducing the number of corresponding key and value heads. However, MQA and GQA decrease the KV-cache size requirements at the expense of LLM accuracy (quality of text generation). These methods do not ensure an optimal tradeoff between KV-cache size and text generation quality due to the absence of quality-aware grouping of query heads. To address this issue, we propose Quality and Capacity-Aware Grouped Query Attention (QCQA), which identifies optimal query head groupings using an evolutionary algorithm with a computationally efficient and inexpensive fitness function. We demonstrate that QCQA achieves a significantly better tradeoff between KV-cache capacity and LLM accuracy compared to GQA. For the Llama2 $7\,$B model, QCQA achieves $\mathbf{20}$\% higher accuracy than GQA with similar KV-cache size requirements in the absence of fine-tuning. After fine-tuning both QCQA and GQA, for a similar KV-cache size, QCQA provides $\mathbf{10.55}\,$\% higher accuracy than GQA. Furthermore, QCQA requires $40\,$\% less KV-cache size than GQA to attain similar accuracy. The proposed quality and capacity-aware grouping of query heads can serve as a new paradigm for KV-cache optimization in autoregressive LLM inference.
Submitted 8 June, 2024;
originally announced June 2024.
-
A Survey on Compositional Learning of AI Models: Theoretical and Experimental Practices
Authors:
Sania Sinha,
Tanawan Premsri,
Parisa Kordjamshidi
Abstract:
Compositional learning, mastering the ability to combine basic concepts and construct more intricate ones, is crucial for human cognition, especially in human language comprehension and visual perception. This notion is tightly connected to generalization over unobserved situations. Despite its integral role in intelligence, there is a lack of systematic theoretical and experimental research methodologies, making it difficult to analyze the compositional learning abilities of computational models. In this paper, we survey the literature on compositional learning of AI models and the connections made to cognitive studies. We identify abstract concepts of compositionality in cognitive and linguistic studies and connect these to the computational challenges faced by language and vision models in compositional reasoning. We overview the formal definitions, tasks, evaluation benchmarks, variety of computational models, and theoretical findings. We cover modern studies on large language models to provide a deeper understanding of the cutting-edge compositional capabilities exhibited by state-of-the-art AI models and pinpoint important directions for future research.
Submitted 12 June, 2024;
originally announced June 2024.
-
Superconducting magic-angle twisted trilayer graphene hosts competing magnetic order and moiré inhomogeneities
Authors:
Ayshi Mukherjee,
Surat Layek,
Subhajit Sinha,
Ritajit Kundu,
Alisha H. Marchawala,
Mahesh Hingankar,
Joydip Sarkar,
L. D. Varma Sangani,
Heena Agarwal,
Sanat Ghosh,
Aya Batoul Tazi,
Kenji Watanabe,
Takashi Taniguchi,
Abhay N. Pasupathy,
Arijit Kundu,
Mandar M. Deshmukh
Abstract:
The microscopic mechanism of superconductivity in the magic-angle twisted graphene family, including magic-angle twisted trilayer graphene (MATTG), is poorly understood. Properties of MATTG, like Pauli limit violation, suggest unconventional superconductivity. Theoretical studies propose proximal magnetic states in the phase diagram, but direct experimental evidence is lacking. We show direct evidence for an in-plane magnetic order proximal to the superconducting state using two complementary electrical transport measurements. First, we probe the superconducting phase by using statistically significant switching events from superconducting to the dissipative state of MATTG. The system behaves like a network of Josephson junctions due to lattice relaxation-induced moiré inhomogeneity in the system. We observe non-monotonic and hysteretic responses in the switching distributions as a function of temperature and in-plane magnetic field. Second, in normal regions doped slightly away from the superconducting regime, we observe hysteresis in magnetoresistance with an in-plane magnetic field; showing evidence for in-plane magnetic order that vanishes $\sim$900 mK. Additionally, we show a broadened Berezinskii-Kosterlitz-Thouless transition due to relaxation-induced moiré inhomogeneity. We find superfluid stiffness $J_{\mathrm{s}}$$\sim$0.15 K with strong temperature dependence. Theoretically, the magnetic and superconducting order arising from the magnetic order's fluctuations have been proposed - we show direct evidence for both. Our observation that the hysteretic magnetoresistance is sensitive to the in-plane field may constrain possible intervalley-coherent magnetic orders and the resulting superconductivity that arises from its fluctuations.
Submitted 4 June, 2024;
originally announced June 2024.
-
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Authors:
Patrick Emami,
Zhaonan Li,
Saumya Sinha,
Truc Nguyen
Abstract:
Surrogate models are used to predict the behavior of complex energy systems that are too expensive to simulate with traditional numerical methods. Our work introduces the use of language descriptions, which we call "system captions" or SysCaps, to interface with such surrogates. We argue that interacting with surrogates through text, particularly natural language, makes these models more accessible for both experts and non-experts. We introduce a lightweight multimodal text and timeseries regression model and a training pipeline that uses large language models (LLMs) to synthesize high-quality captions from simulation metadata. Our experiments on two real-world simulators of buildings and wind farms show that our SysCaps-augmented surrogates have better accuracy on held-out systems than traditional methods while enjoying new generalization abilities, such as handling semantically related descriptions of the same test system. Additional experiments also highlight the potential of SysCaps to unlock language-driven design space exploration and to regularize training through prompt augmentation.
Submitted 2 October, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Self-trapping phenomenon, multistability and chaos in open anisotropic Dicke dimer
Authors:
G. Vivek,
Debabrata Mondal,
Subhadeep Chakraborty,
S. Sinha
Abstract:
We investigate the semiclassical dynamics of a coupled atom-photon interacting system described by a dimer of the anisotropic Dicke model in the presence of photon loss, exhibiting a rich variety of non-linear dynamics. Based on symmetries and dynamical classification, we characterize and chart out various dynamical phases in a phase diagram. A key feature of this system is the multistability of different dynamical states, particularly the coexistence of various superradiant phases as well as limit cycles. Remarkably, this dimer system manifests self-trapping phenomena, resulting in a photon population imbalance between the cavities. Such a self-trapped state arises from a saddle-node bifurcation, which can be understood from an equivalent Landau-Ginzburg description. Additionally, we identify a unique class of oscillatory dynamics, the self-trapped limit cycle, hosting self-trapping of photons. The absence of stable dynamical phases leads to the onset of chaos, which is diagnosed using the saturation value of the decorrelator dynamics. Moreover, in a narrow region, the self-trapped states can coexist with a chaotic attractor, which may have intriguing consequences in quantum dynamics. Finally, we discuss the experimental relevance of our findings, which can be tested in cavity and circuit quantum electrodynamics setups.
Submitted 22 May, 2024;
originally announced May 2024.
-
MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning
Authors:
Sanchit Sinha,
Yuguang Yue,
Victor Soto,
Mayank Kulkarni,
Jianhua Lu,
Aidong Zhang
Abstract:
Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have been proposed such as MetaICL and MetaICT, which involve meta-training pre-trained LLMs on a wide variety of diverse tasks. These meta-training approaches essentially perform in-context multi-task fine-tuning and evaluate on a disjointed test set of tasks. Even though they achieve impressive performance, their goal is never to compute a truly general set of parameters. In this paper, we propose MAML-en-LLM, a novel method for meta-training LLMs, which can learn truly generalizable parameters that not only perform well on disjointed tasks but also adapt to unseen tasks. We observe an average performance increase of 2% on unseen domains and a 4% improvement in adaptation performance. Furthermore, we demonstrate that MAML-en-LLM outperforms baselines in settings with a limited amount of training data on both seen and unseen domains by an average of 2%. Finally, we discuss the effects of the type of tasks, optimizers, and task complexity, an avenue barely explored in the meta-training literature. Exhaustive experiments across 7 task settings along with two data settings demonstrate that models trained with MAML-en-LLM outperform SOTA meta-training approaches.
Submitted 19 May, 2024;
originally announced May 2024.
-
Tunable moiré materials for probing Berry physics and topology
Authors:
Pratap Chandra Adak,
Subhajit Sinha,
Amit Agarwal,
Mandar M. Deshmukh
Abstract:
Berry curvature physics and quantum geometric effects have been instrumental in advancing topological condensed matter physics in recent decades. Although Landau level-based flat bands and conventional 3D solids have been pivotal in exploring rich topological phenomena, they are constrained by their limited ability to undergo dynamic tuning. In stark contrast, moiré systems have risen as a versatile platform for engineering bands and manipulating the distribution of Berry curvature in momentum space. These moiré systems not only harbor tunable topological bands, modifiable through a plethora of parameters, but also provide unprecedented access to large length scales and low energy scales. Furthermore, they offer unique opportunities stemming from the symmetry-breaking mechanisms and electron correlations associated with the underlying flat bands that are beyond the reach of conventional crystalline solids. A diverse array of tools, encompassing quantum electron transport in both linear and non-linear response regimes and optical excitation techniques, provide direct avenues for investigating Berry physics. This review navigates the evolving landscape of tunable moiré materials, highlighting recent experimental breakthroughs in the field of topological physics. Additionally, we delineate several challenges and offer insights into promising avenues for future research.
Submitted 14 May, 2024;
originally announced May 2024.
-
Pressure induced metallization and loss of surface magnetism in FeSi
Authors:
Yuhang Deng,
Farhad Taraporevala,
Haozhe Wang,
Eric Lee-Wong,
Camilla M. Moir,
Jinhyuk Lim,
Shubham Sinha,
Weiwei Xie,
James Hamlin,
Yogesh Vohra,
M. Brian Maple
Abstract:
Single crystalline FeSi samples with a conducting surface state (CSS) were studied under high pressure ($\textit{P}$) and magnetic field ($\textit{B}$) by means of electrical resistance ($\textit{R}$) measurements to explore how the bulk semiconducting state and the surface state are tuned by the application of pressure. We found that the energy gap ($Δ$) associated with the semiconducting bulk phase begins to close abruptly at a critical pressure ($P_{cr}$) of ~10 GPa and the bulk material becomes metallic with no obvious sign of any emergent phases or non-Fermi liquid behavior in $\textit{R}$($\textit{T}$) in the neighborhood of $P_{cr}$ above 3 K. Moreover, the metallic phase appears to remain at near-ambient pressure upon release of the pressure. Interestingly, the hysteresis in the $\textit{R}$($\textit{T}$) curve associated with the magnetically ordered CSS decreases with pressure and vanishes at $P_{cr}$, while the slope of the $\textit{R}$($\textit{B}$) curve, d$\textit{R}$/d$\textit{B}$, which has a negative value for $\textit{P}$ < $P_{cr}$, decreases in magnitude with $\textit{P}$ and changes sign at $P_{cr}$. Thus, the CSS and the corresponding two-dimensional magnetic order collapse at $P_{cr}$ where the energy gap $Δ$ of the bulk material starts to close abruptly, revealing the connection between the CSS and the semiconducting bulk state in FeSi.
Submitted 7 May, 2024;
originally announced May 2024.
-
CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification
Authors:
Sankalp Sinha,
Muhammad Saif Ullah Khan,
Talha Uddin Sheikh,
Didier Stricker,
Muhammad Zeshan Afzal
Abstract:
Zero-shot learning has been extensively investigated in the broader field of visual recognition, attracting significant interest recently. However, the current work on zero-shot learning in document image classification remains scarce. The existing studies either focus exclusively on zero-shot inference, or their evaluation does not align with the established criteria of zero-shot evaluation in the visual recognition domain. We provide a comprehensive document image classification analysis in Zero-Shot Learning (ZSL) and Generalized Zero-Shot Learning (GZSL) settings to address this gap. Our methodology and evaluation align with the established practices of this domain. Additionally, we propose zero-shot splits for the RVL-CDIP dataset. Furthermore, we introduce CICA (pronounced 'ki-ka'), a framework that enhances the zero-shot learning capabilities of CLIP. CICA consists of a novel 'content module' designed to leverage any generic document-related textual information. The discriminative features extracted by this module are aligned with CLIP's text and image features using a novel 'coupled-contrastive' loss. Our module improves CLIP's ZSL top-1 accuracy by 6.7% and GZSL harmonic mean by 24% on the RVL-CDIP dataset. Our module is lightweight and adds only 3.3% more parameters to CLIP. Our work sets the direction for future research in zero-shot document classification.
Submitted 6 May, 2024;
originally announced May 2024.
-
Proliferation-driven mechanical feedback regulates cell dynamics in growing tissues
Authors:
Sumit Sinha,
Xin Li,
Abdul N Malmi-Kakkada,
D. Thirumalai
Abstract:
Local stresses in a tissue, a collective property, regulate cell division and apoptosis. In turn, cell growth and division induce active stresses in the tissue. As a consequence, there is a feedback between cell growth and local stresses. However, how the cell dynamics depend on local stress-dependent cell division and the feedback strength is not fully understood. Here, we probe the consequences of stress-mediated growth and cell division on cell dynamics using agent-based simulations of a two-dimensional growing tissue. We discover a rich dynamical behavior of individual cells, ranging from jamming (mean square displacement, $Δ(t) \sim t^α$ with $α$ less than unity), to hyperdiffusion ($α> 2$) depending on cell division rate and the strength of the mechanical feedback. Strikingly, $Δ(t)$ is determined by the tissue growth law, which quantifies cell proliferation (number of cells $N(t)$ as a function of time). The growth law ($N(t) \sim t^λ$ at long times) is regulated by the critical pressure that controls the strength of the mechanical feedback and the ratio between cell division-apoptosis rates. We show that $λ\sim α$, which implies that higher growth rate leads to a greater degree of cell migration. The variations in cell motility are linked to the emergence of highly persistent forces extending over several cell cycle times. Our predictions are testable using cell-tracking imaging techniques.
Submitted 3 May, 2024;
originally announced May 2024.
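
As an illustration of the scaling $Δ(t) \sim t^α$ quoted in the abstract above, the exponent can be estimated from a trajectory's mean square displacement by a straight-line fit in log-log space. The sketch below is not the authors' analysis code: the function name `fit_power_law_exponent` and the synthetic data are assumptions made only for this example.

```python
import math


def fit_power_law_exponent(times, values):
    """Return the least-squares slope of log(values) versus log(times).

    For data following values ~ times**alpha, this slope is alpha.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic (hypothetical) mean square displacements, not simulation output.
times = [float(t) for t in range(1, 201)]
jammed_msd = [0.3 * t ** 0.5 for t in times]        # alpha < 1
hyperdiffusive_msd = [0.3 * t ** 2.5 for t in times]  # alpha > 2

alpha_jammed = fit_power_law_exponent(times, jammed_msd)
alpha_hyper = fit_power_law_exponent(times, hyperdiffusive_msd)
```

Applying the same fit to the cell count $N(t)$ would give an estimate of $λ$, so the two exponents could be compared directly.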
-
A Self-explaining Neural Architecture for Generalizable Concept Learning
Authors:
Sanchit Sinha,
Guangzhi Xiong,
Aidong Zhang
Abstract:
With the wide proliferation of Deep Neural Networks in high-stakes applications, there is a growing demand for explainability behind their decision-making process. Concept learning models attempt to learn high-level 'concepts' - abstract entities that align with human understanding, and thus provide interpretability to DNN architectures. However, in this paper, we demonstrate that present SOTA concept learning approaches suffer from two major problems: a lack of concept fidelity, wherein the models fail to learn consistent concepts among similar classes, and limited concept interoperability, wherein the models fail to generalize learned concepts to new domains for the same task. Keeping these in mind, we propose a novel self-explaining architecture for concept learning across domains which i) incorporates a new concept saliency network for representative concept selection, ii) utilizes contrastive learning to capture representative domain-invariant concepts, and iii) uses a novel prototype-based concept grounding regularization to improve concept alignment across domains. We demonstrate the efficacy of our proposed approach over current SOTA concept learning approaches on four widely used real-world datasets. Empirical results show that our method improves both concept fidelity, measured through concept overlap, and concept interoperability, measured through domain adaptation performance.
Submitted 5 May, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Decoherence of a charged Brownian particle in a magnetic field: an analysis of the roles of coupling via position and momentum variables
Authors:
Suraka Bhattacharjee,
Koushik Mandal,
Supurna Sinha
Abstract:
The study of decoherence plays a key role in our understanding of the transition from the quantum to the classical world. Typically, one considers a system coupled to an external bath which forms a model for an open quantum system. While most of the studies pertain to a position coupling between the system and the environment, some involve a momentum coupling, giving rise to an anomalous diffusive model. Here we have gone beyond existing studies and analysed the quantum Langevin dynamics of a harmonically oscillating charged Brownian particle in the presence of a magnetic field and coupled to an Ohmic heat bath via both position and momentum couplings. The presence of both position and momentum couplings leads to a stronger interaction with the environment, resulting in a faster loss of coherence compared to a situation where only position coupling is present. The rate of decoherence can be tuned by controlling the relative strengths of the position and momentum coupling parameters. In addition, the magnetic field results in the slowing down of the loss of information from the system, irrespective of the nature of coupling between the system and the bath. Our results can be experimentally verified by designing a suitable ion trap setup.
Submitted 22 April, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry
Authors:
Shiven Sinha,
Ameya Prabhu,
Ponnurangam Kumaraguru,
Siddharth Bhat,
Matthias Bethge
Abstract:
Proving geometric theorems constitutes a hallmark of visual reasoning combining both intuitive and logical skills. Therefore, automated theorem proving of Olympiad-level geometry problems is considered a notable milestone in human-level automated reasoning. The introduction of AlphaGeometry, a neuro-symbolic model trained with 100 million synthetic samples, marked a major breakthrough. It solved 25 of 30 International Mathematical Olympiad (IMO) problems whereas the reported baseline based on Wu's method solved only ten. In this note, we revisit the IMO-AG-30 Challenge introduced with AlphaGeometry, and find that Wu's method is surprisingly strong. Wu's method alone can solve 15 problems, and some of them are not solved by any of the other methods. This leads to two key findings: (i) Combining Wu's method with the classic synthetic methods of deductive databases and angle, ratio, and distance chasing solves 21 out of 30 problems using just a CPU-only laptop with a time limit of 5 minutes per problem. Essentially, this classic method solves just 4 fewer problems than AlphaGeometry and establishes the first fully symbolic baseline strong enough to rival the performance of an IMO silver medalist. (ii) Wu's method even solves 2 of the 5 problems that AlphaGeometry failed to solve. Thus, by combining AlphaGeometry with Wu's method we set a new state-of-the-art for automated theorem proving on IMO-AG-30, solving 27 out of 30 problems, the first AI method which outperforms an IMO gold medalist.
Submitted 11 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Every Shot Counts: Using Exemplars for Repetition Counting in Videos
Authors:
Saptarshi Sinha,
Alexandros Stergiou,
Dima Damen
Abstract:
Video repetition counting infers the number of repetitions of recurring actions or motion within a video. We propose an exemplar-based approach that discovers visual correspondence of video exemplars across repetitions within target videos. Our proposed Every Shot Counts (ESCounts) model is an attention-based encoder-decoder that encodes videos of varying lengths alongside exemplars from the same and different videos. In training, ESCounts regresses locations of high correspondence to the exemplars within the video. In tandem, our method learns a latent that encodes representations of general repetitive motions, which we use for exemplar-free, zero-shot inference. Extensive experiments over commonly used datasets (RepCount, Countix, and UCFRep) showcase ESCounts obtaining state-of-the-art performance across all three datasets. Detailed ablations further demonstrate the effectiveness of our method.
Submitted 13 October, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Effect of light-assisted tunable interaction on the position response function of cold atoms
Authors:
Anirban Misra,
Urbashi Satpathi,
Supurna Sinha,
Sanjukta Roy,
Saptarishi Chaudhuri
Abstract:
The position response of a particle subjected to a perturbation is of general interest in physics. We study the modification of the position response function of an ensemble of cold atoms in a magneto-optical trap in the presence of tunable light-assisted interactions. We subject the cold atoms to an intense laser light tuned near the photoassociation resonance and observe the position response of the atoms subjected to a sudden displacement. Surprisingly, we observe that the entire cold atomic cloud undergoes collective oscillations. We use a generalised quantum Langevin approach to theoretically analyse the results of the experiments and find good agreement.
Submitted 26 March, 2024;
originally announced March 2024.
-
Evolution beats random chance: Performance-dependent network evolution for enhanced computational capacity
Authors:
Manish Yadav,
Sudeshna Sinha,
Merten Stender
Abstract:
The quest to understand structure-function relationships in networks across scientific disciplines has intensified. However, the optimal network architecture remains elusive, particularly for complex information processing. Therefore, we investigate how optimal and specific network structures form to efficiently solve distinct tasks using a novel framework of performance-dependent network evolution, leveraging reservoir computing principles. Our study demonstrates that task-specific minimal network structures obtained through this framework consistently outperform networks generated by alternative growth strategies and Erdős-Rényi random networks. Evolved networks exhibit unexpected sparsity and adhere to scaling laws in node-density space while showcasing a distinctive asymmetry in input and information readout nodes distribution. Consequently, we propose a heuristic for quantifying task complexity from performance-dependently evolved networks, offering valuable insights into the evolutionary dynamics of network structure-function relationships. Our findings not only advance the fundamental understanding of process-specific network evolution but also shed light on the design and optimization of complex information processing mechanisms, notably in machine learning.
Submitted 26 March, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
Prospects for measuring time variation of astrophysical neutrino sources at dark matter detectors
Authors:
Yi Zhuang,
Louis E. Strigari,
Lei Jin,
Samiran Sinha
Abstract:
We study the prospects for measuring the time variation of solar and atmospheric neutrino fluxes at future large-scale Xenon and Argon dark matter detectors. For solar neutrinos, a yearly time variation arises from the eccentricity of the Earth's orbit, and, for charged current interactions, from a smaller energy-dependent day-night variation due to flavor regeneration as neutrinos travel through the Earth. For a 100-ton Xenon detector running for 10 years with a Xenon-136 fraction of $\lesssim 0.1\%$, in the electron recoil channel a time-variation amplitude of about 0.8\% is detectable with a power of 90\% and a significance level of 10\%. This is sufficient to detect time variation due to eccentricity, which has an amplitude of $\sim 3\%$. In the nuclear recoil channel, the detectable amplitude is about 10\% under current detector resolution and efficiency conditions, and this generally reduces to about 1\% for improved detector resolution and efficiency, the latter of which is sufficient to detect time variation due to eccentricity. Our analysis assumes both known and unknown periods. We provide scalings to determine the sensitivity to an arbitrary time-varying amplitude as a function of detector parameters. Identifying the time variation of the neutrino fluxes will be important for distinguishing neutrinos from dark matter signals and other detector-related backgrounds, and extracting properties of neutrinos that can be uniquely studied in dark matter experiments.
Submitted 28 February, 2024;
originally announced February 2024.
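
The $\sim 3\%$ eccentricity amplitude quoted in the abstract above can be recovered with a short first-order estimate. This is a back-of-the-envelope sketch rather than the authors' calculation; it assumes only that the solar neutrino flux falls off as the inverse square of the Earth-Sun distance, and uses the Earth's orbital eccentricity $e \approx 0.0167$, with $a$ the semi-major axis and $\theta$ the orbital angle measured from perihelion.

```latex
\begin{align}
r(\theta) &= \frac{a\,(1 - e^2)}{1 + e\cos\theta} \approx a\,(1 - e\cos\theta), \\
\Phi(\theta) &\propto \frac{1}{r(\theta)^2} \approx \frac{1}{a^2}\,(1 + 2e\cos\theta), \\
\frac{\Delta\Phi}{\bar{\Phi}} &\approx 2e \approx 2 \times 0.0167 \approx 3.3\%.
\end{align}
```

Terms of order $e^2$ are dropped throughout, which is why the result is only approximate.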
-
Prompting LLMs to Compose Meta-Review Drafts from Peer-Review Narratives of Scholarly Manuscripts
Authors:
Shubhra Kanti Karmaker Santu,
Sanjeev Kumar Sinha,
Naman Bansal,
Alex Knipper,
Souvika Sarkar,
John Salvador,
Yash Mahajan,
Sri Guttikonda,
Mousumi Akter,
Matthew Freestone,
Matthew C. Williams Jr
Abstract:
One of the most important yet onerous tasks in the academic peer-reviewing process is composing meta-reviews, which involves understanding the core contributions, strengths, and weaknesses of a scholarly manuscript based on peer-review narratives from multiple experts and then summarizing those multiple experts' perspectives into a concise holistic overview. Given the latest major developments in generative AI, especially Large Language Models (LLMs), it is very compelling to rigorously study the utility of LLMs in generating such meta-reviews in an academic peer-review setting. In this paper, we perform a case study with three popular LLMs, i.e., GPT-3.5, LLaMA2, and PaLM2, to automatically generate meta-reviews by prompting them with different types/levels of prompts based on the recently proposed TELeR taxonomy. Finally, we perform a detailed qualitative study of the meta-reviews generated by the LLMs and summarize our findings and recommendations for prompting LLMs for this complex task.
Submitted 23 February, 2024;
originally announced February 2024.
-
Multi Agent Influence Diagrams for DeFi Governance
Authors:
Abhimanyu Nag,
Samrat Gupta,
Sudipan Sinha,
Arka Datta
Abstract:
Decentralized Finance (DeFi) governance models have become increasingly complex due to the involvement of numerous independent agents, each with their own incentives and strategies. To effectively analyze these systems, we propose using Multi Agent Influence Diagrams (MAIDs) as a powerful tool for modeling and studying the strategic interactions within DeFi governance. MAIDs allow for a comprehensive representation of the decision-making processes of various agents, capturing the influence of their actions on one another and on the overall governance outcomes. In this paper, we study a simple governance game that approximates real governance protocols and compute the Nash equilibria using MAIDs. We further outline the structure of a MAID in MakerDAO.
Submitted 15 October, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Television Discourse Decoded: Comprehensive Multimodal Analytics at Scale
Authors:
Anmol Agarwal,
Pratyush Priyadarshi,
Shiven Sinha,
Shrey Gupta,
Hitkul Jangra,
Ponnurangam Kumaraguru,
Kiran Garimella
Abstract:
In this paper, we tackle the complex task of analyzing televised debates, with a focus on a prime-time news debate show from India. Previous methods, which often relied solely on text, fall short in capturing the multimodal essence of these debates. To address this gap, we introduce a comprehensive automated toolkit that employs advanced computer vision and speech-to-text techniques for large-scale multimedia analysis. Utilizing state-of-the-art computer vision algorithms and speech-to-text methods, we transcribe, diarize, and analyze thousands of YouTube videos of a prime-time television debate show in India. These debates are a central part of Indian media but have been criticized for compromised journalistic integrity and excessive dramatization. Our toolkit provides concrete metrics to assess bias and incivility, capturing a comprehensive multimedia perspective that includes text, audio utterances, and video frames. Our findings reveal significant biases in topic selection and panelist representation, along with alarming levels of incivility. This work offers a scalable, automated approach for future research in multimedia analysis, with profound implications for the quality of public discourse and democratic debate. To catalyze further research in this area, we also release the code, the collected dataset, and a supplemental PDF.
Submitted 6 August, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
RanDumb: A Simple Approach that Questions the Efficacy of Continual Representation Learning
Authors:
Ameya Prabhu,
Shiven Sinha,
Ponnurangam Kumaraguru,
Philip H. S. Torr,
Ozan Sener,
Puneet K. Dokania
Abstract:
Continual learning has primarily focused on the issue of catastrophic forgetting and the associated stability-plasticity tradeoffs. However, little attention has been paid to the efficacy of continually learned representations, as representations are learned alongside classifiers throughout the learning process. Our primary contribution is empirically demonstrating that existing online continually trained deep networks produce inferior representations compared to simple pre-defined random transforms. Our approach embeds raw pixels using a fixed random transform, approximating an RBF-Kernel initialized before any data is seen. We then train a simple linear classifier on top without storing any exemplars, processing one sample at a time in an online continual learning setting. This method, called RanDumb, significantly outperforms state-of-the-art continually learned representations across all standard online continual learning benchmarks. Our study reveals the significant limitations of representation learning, particularly in low-exemplar and online continual learning scenarios. Extending our investigation to popular exemplar-free scenarios with pretrained models, we find that training only a linear classifier on top of pretrained representations surpasses most continual fine-tuning and prompt-tuning strategies. Overall, our investigation challenges the prevailing assumptions about effective representation learning in online continual learning. Our code is available at https://github.com/drimpossible/RanDumb.
Submitted 23 July, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
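
The idea in the abstract above, a fixed random embedding that approximates an RBF kernel followed by a linear classifier updated one sample at a time, can be sketched in a few lines. This is not the released RanDumb code: it swaps in random Fourier features as one standard RBF approximation and a running nearest-class-mean rule as one simple linear classifier, and the class names, `gamma`, and the toy two-cluster data are all assumptions chosen for illustration.

```python
import math
import random


class RandomFourierFeatures:
    """Fixed random map z(x); z(x).z(y) approximates exp(-gamma * ||x - y||^2)."""

    def __init__(self, input_dim, num_features, gamma, seed=0):
        rng = random.Random(seed)
        scale = math.sqrt(2.0 * gamma)  # std of the Gaussian projection weights
        self.weights = [[rng.gauss(0.0, scale) for _ in range(input_dim)]
                        for _ in range(num_features)]
        self.offsets = [rng.uniform(0.0, 2.0 * math.pi)
                        for _ in range(num_features)]
        self.norm = math.sqrt(2.0 / num_features)

    def transform(self, x):
        return [self.norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(self.weights, self.offsets)]


class OnlineNearestClassMean:
    """Keeps one running mean per class; no samples are stored."""

    def __init__(self):
        self.means = {}
        self.counts = {}

    def partial_fit(self, z, label):
        if label not in self.means:
            self.means[label] = list(z)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        mean = self.means[label]
        for i, value in enumerate(z):
            mean[i] += (value - mean[i]) / n  # incremental mean update

    def predict(self, z):
        def squared_distance(mean):
            return sum((a - b) ** 2 for a, b in zip(z, mean))
        return min(self.means, key=lambda label: squared_distance(self.means[label]))


# Toy stream: two well-separated 2-D clusters (hypothetical data).
data_rng = random.Random(1)
features = RandomFourierFeatures(input_dim=2, num_features=256, gamma=0.5, seed=0)
classifier = OnlineNearestClassMean()
centers = {0: (0.0, 0.0), 1: (3.0, 3.0)}
for _ in range(50):
    for label, (cx, cy) in centers.items():
        sample = (cx + data_rng.gauss(0.0, 0.1), cy + data_rng.gauss(0.0, 0.1))
        classifier.partial_fit(features.transform(sample), label)

predictions = [classifier.predict(features.transform(c)) for c in centers.values()]
```

The embedding is drawn once, before any data arrives, and never updated; only the per-class means change as the stream is processed.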
-
Engineering End-to-End Remote Labs using IoT-based Retrofitting
Authors:
K. S. Viswanadh,
Akshit Gureja,
Nagesh Walchatwar,
Rishabh Agrawal,
Shiven Sinha,
Sachin Chaudhari,
Karthik Vaidhyanathan,
Venkatesh Choppella,
Prabhakar Bhimalapuram,
Harikumar Kandath,
Aftab Hussain
Abstract:
Remote labs are a groundbreaking development in the education industry, providing students with access to laboratory education anytime, anywhere. However, most remote labs are costly and difficult to scale, especially in developing countries. With this as a motivation, this paper proposes a new remote labs (RLabs) solution that includes two use case experiments: Vanishing Rod and Focal Length. The hardware experiments are built at a low cost by retrofitting Internet of Things (IoT) components. They are also made portable by designing miniaturised and modular setups. The software architecture designed as part of the solution seamlessly supports the scalability of the experiments, offering compatibility with a wide range of hardware devices and IoT platforms. Additionally, it can live-stream remote experiments without needing dedicated server space for the stream. The software architecture also includes an automation suite that periodically checks the status of the experiments using computer vision (CV). RLabs is qualitatively evaluated against seven non-functional attributes - affordability, portability, scalability, compatibility, maintainability, usability, and universality. Finally, user feedback was collected from a group of students, and the scores indicate a positive response regarding the students' learning and the platform's usability.
Submitted 8 February, 2024;
originally announced February 2024.
-
Towards Deterministic End-to-end Latency for Medical AI Systems in NVIDIA Holoscan
Authors:
Soham Sinha,
Shekhar Dwivedi,
Mahdi Azizian
Abstract:
The introduction of AI and ML technologies into medical devices has revolutionized healthcare diagnostics and treatments. Medical device manufacturers are keen to maximize the advantages afforded by AI and ML by consolidating multiple applications onto a single platform. However, concurrent execution of several AI applications, each with its own visualization components, leads to unpredictable end-to-end latency, primarily due to GPU resource contentions. To mitigate this, manufacturers typically deploy separate workstations for distinct AI applications, thereby increasing financial, energy, and maintenance costs. This paper addresses these challenges within the context of NVIDIA's Holoscan platform, a real-time AI system for streaming sensor data and images. We propose a system design optimized for heterogeneous GPU workloads, encompassing both compute and graphics tasks. Our design leverages CUDA MPS for spatial partitioning of compute workloads and isolates compute and graphics processing onto separate GPUs. We demonstrate significant performance improvements across various end-to-end latency determinism metrics through empirical evaluation with real-world Holoscan medical device applications. For instance, the proposed design reduces maximum latency by 21-30% and improves latency distribution flatness by 17-25% for up to five concurrent endoscopy tool tracking AI applications, compared to a single-GPU baseline. Against a default multi-GPU setup, our optimizations decrease maximum latency by 35% for up to six concurrent applications by improving GPU utilization by 42%. This paper provides clear design insights for AI applications in the edge-computing domain including medical systems, where performance predictability of concurrent and heterogeneous GPU workloads is a critical requirement.
Submitted 6 February, 2024;
originally announced February 2024.