Leveraging Internet Principles to Build a Quantum Network

Authors: Leonardo Bacciottini, Aparimit Chandra, Matheus Guedes De Andrade, Nitish K. Panigrahy, Shahrooz Pouryousef, Nageswara S. V. Rao, Emily Van Milligen, Gayane Vardoyan, Don Towsley

Abstract: Designing an operational architecture for the Quantum Internet is a challenging task in light of both fundamental limitations imposed by the laws of physics and technological constraints. Here, we propose a method to abstract away most of the quantum-specific elements and formulate a best-effort quantum network architecture based on packet-switching, akin to that of the classical Internet. Such reframing provides an opportunity to exploit the many tools and protocols available and well-understood within the Internet. As an illustration, we tailor and adapt classical congestion control and active queue management protocols to quantum networks, comprising an architecture wherein quantum end- and intermediate nodes effectively regulate demand and resource utilization, respectively. Results show that these classical networking tools can be effectively used to combat quantum memory decoherence and keep end-to-end fidelity around a target value.

Submitted 11 October, 2024; originally announced October 2024.

Comments: 9 pages, 5 figures

arXiv:2410.04249 [pdf, other]

DiffSpec: Differential Testing with LLMs using Natural Language Specifications and Code Artifacts

Authors: Nikitha Rao, Elizabeth Gilbert, Tahina Ramananandro, Nikhil Swamy, Claire Le Goues, Sarah Fakhoury

Abstract: Differential testing can be an effective way to find bugs in software systems with multiple implementations that conform to the same specification, like compilers, network protocol parsers, and language runtimes. Specifications for such systems are often standardized in natural language documents, like Instruction Set Architecture (ISA) specifications, Wasm specifications or IETF RFCs. Large Language Models (LLMs) have demonstrated potential in both generating tests and handling large volumes of natural language text, making them well-suited for utilizing artifacts like specification documents, bug reports, and code implementations. In this work, we leverage natural language and code artifacts to guide LLMs to generate targeted, meaningful tests that highlight meaningful behavioral differences between implementations, including those corresponding to bugs. We introduce DiffSpec, a framework for generating differential tests with LLMs using prompt chaining. We demonstrate the efficacy of DiffSpec on two different systems, namely, eBPF runtimes and Wasm validators. Using DiffSpec, we generated 359 differentiating tests, uncovering at least four distinct and confirmed bugs in eBPF, including a kernel memory leak, inconsistent behavior in jump instructions, and undefined behavior when using the stack pointer. We also found 279 differentiating tests in Wasm validators that point to at least 2 confirmed and fixed bugs.

Submitted 5 October, 2024; originally announced October 2024.

arXiv:2409.16409 [pdf, ps, other]

Robust Mean Squared Prediction Error Estimators of EBLUP of a Small Area Mean Under the Fay-Herriot Model

Authors: Shijie Chen, P. Lahiri, J. N. K. Rao

Abstract: In this paper we derive a second-order unbiased (or nearly unbiased) mean squared prediction error (MSPE) estimator of empirical best linear unbiased predictor (EBLUP) of a small area mean for a non-normal extension to the well-known Fay-Herriot model. Specifically, we derive our MSPE estimator essentially assuming certain moment conditions on both the sampling and random effects distributions. The normality-based Prasad-Rao MSPE estimator has a surprising robustness property in that it remains second-order unbiased under the non-normality of random effects when a simple method-of-moments estimator is used for the variance component and the sampling error distribution is normal. We show that the normality-based MSPE estimator is no longer second-order unbiased when the sampling error distribution is non-normal or when the Fay-Herriot moment method is used to estimate the variance component, even when the sampling error distribution is normal. It is interesting to note that when the simple method-of-moments estimator is used for the variance component, our proposed MSPE estimator does not require the estimation of kurtosis of the random effects. Results of a simulation study on the accuracy of the proposed MSPE estimator, under non-normality of both sampling and random effects distributions, are also presented.

Submitted 24 September, 2024; originally announced September 2024.

arXiv:2409.15496 [pdf, other]

Continuous Variable Quantum Key Distribution with Single Quadrature Measurement at Arbitrary Reference Frame

Authors: Vinod N. Rao, Emma Tien Hwai Medlock, Timothy Spiller, Rupesh Kumar

Abstract: We propose a simplified measurement scheme for a Gaussian modulated coherent state (GMCS) protocol for continuous-variable quantum key distribution (CVQKD), utilizing homodyne detection without quadrature switching. The reference frame of measurement is taken to be at an arbitrary angle; however, reconciliation converges the proposed scheme to GMCS with switching quadrature protocol (GG02). The arbitrary frame of measurement could also include the unknown random thermal drift within Bob's optical measurement setup. We found this scheme is advantageous for practical free-space and fibre-based GMCS protocol based CVQKD systems as it does not require a phase modulator for random measurement selection quadrature at Bob.

Submitted 23 September, 2024; originally announced September 2024.

Comments: 8 pages, 4 figures

arXiv:2409.12447 [pdf, other]

Prompts Are Programs Too! Understanding How Developers Build Software Containing Prompts

Authors: Jenny T. Liang, Melissa Lin, Nikitha Rao, Brad A. Myers

Abstract: The introduction of generative pre-trained models, like GPT-4, has introduced a phenomenon known as prompt engineering, whereby model users repeatedly write and revise prompts while trying to achieve a task. Using these AI models for intelligent features in software applications requires using APIs that are controlled through developer-written prompts. These prompts have powered AI experiences in popular software products, potentially reaching millions of users. Despite the growing impact of prompt-powered software, little is known about its development process and its relationship to programming. In this work, we argue that some forms of prompts are programs, and that the development of prompts is a distinct phenomenon in programming. We refer to this phenomenon as prompt programming. To this end, we develop an understanding of prompt programming using Straussian grounded theory through interviews with 20 developers engaged in prompt development across a variety of contexts, models, domains, and prompt complexities. Through this study, we contribute 14 observations about prompt programming. For example, rather than building mental models of code, prompt programmers develop mental models of the FM's behavior on the prompt and its unique qualities by interacting with the model. While prior research has shown that experts have well-formed mental models, we find that prompt programmers who have developed dozens of prompts, each with many iterations, still struggle to develop reliable mental models. This contributes to a rapid and unsystematic development process. Taken together, our observations indicate that prompt programming is significantly different from traditional software development, motivating the creation of tools to support prompt programming. Our findings have implications for software engineering practitioners, educators, and researchers.

Submitted 18 September, 2024; originally announced September 2024.

arXiv:2409.11238 [pdf, other]

Leveraging Symmetry to Accelerate Learning of Trajectory Tracking Controllers for Free-Flying Robotic Systems

Authors: Jake Welde, Nishanth Rao, Pratik Kunapuli, Dinesh Jayaraman, Vijay Kumar

Abstract: Tracking controllers enable robotic systems to accurately follow planned reference trajectories. In particular, reinforcement learning (RL) has shown promise in the synthesis of controllers for systems with complex dynamics and modest online compute budgets. However, the poor sample efficiency of RL and the challenges of reward design make training slow and sometimes unstable, especially for high-dimensional systems. In this work, we leverage the inherent Lie group symmetries of robotic systems with a floating base to mitigate these challenges when learning tracking controllers. We model a general tracking problem as a Markov decision process (MDP) that captures the evolution of both the physical and reference states. Next, we prove that symmetry in the underlying dynamics and running costs leads to an MDP homomorphism, a mapping that allows a policy trained on a lower-dimensional "quotient" MDP to be lifted to an optimal tracking controller for the original system. We compare this symmetry-informed approach to an unstructured baseline, using Proximal Policy Optimization (PPO) to learn tracking controllers for three systems: the Particle (a forced point mass), the Astrobee (a fully actuated space robot), and the Quadrotor (an underactuated system). Results show that a symmetry-aware approach both accelerates training and reduces tracking error after the same number of training steps.

Submitted 17 September, 2024; originally announced September 2024.

Comments: The first three authors contributed equally to this work

arXiv:2409.05386 [pdf, other]

Predictive Coding with Spiking Neural Networks: a Survey

Authors: Antony W. N'dri, William Gebhardt, Céline Teulière, Fleur Zeldenrust, Rajesh P. N. Rao, Jochen Triesch, Alexander Ororbia

Abstract: In this article, we review a class of neuro-mimetic computational models that we place under the label of spiking predictive coding. Specifically, we review the general framework of predictive processing in the context of neurons that emit discrete action potentials, i.e., spikes. Theoretically, we structure our survey around how prediction errors are represented, which results in an organization of historical neuromorphic generalizations that is centered around three broad classes of approaches: prediction errors in explicit groups of error neurons, in membrane potentials, and implicit prediction error encoding. Furthermore, we examine some applications of spiking predictive coding that utilize more energy-efficient, edge-computing hardware platforms. Finally, we highlight important future directions and challenges in this emerging line of inquiry in brain-inspired computing. Building on the prior results of work in computational cognitive neuroscience, machine intelligence, and neuromorphic engineering, we hope that this review of neuromorphic formulations and implementations of predictive coding will encourage and guide future research and development in this emerging research area.

Submitted 9 September, 2024; originally announced September 2024.

arXiv:2409.00737 [pdf, other]

doi 10.1145/3678884.3681829

Data Collectives as a means to Improve Accountability, Combat Surveillance and Reduce Inequalities

Authors: Jane Hsieh, Angie Zhang, Seyun Kim, Varun Nagaraj Rao, Samantha Dalal, Alexandra Mateescu, Rafael Do Nascimento Grohmann, Motahhare Eslami, Min Kyung Lee, Haiyi Zhu

Abstract: Platform-based laborers face unprecedented challenges and working conditions that result from algorithmic opacity, insufficient data transparency, and unclear policies and regulations. The CSCW and HCI communities increasingly turn to worker data collectives as a means to advance related policy and regulation, hold platforms accountable for data transparency and disclosure, and empower the collective worker voice. However, fundamental questions remain for designing, governing and sustaining such data infrastructures. In this workshop, we leverage frameworks such as data feminism to design sustainable and power-aware data collectives that tackle challenges present in various types of online labor platforms (e.g., ridesharing, freelancing, crowdwork, carework). While data collectives aim to support worker collectives and complement relevant policy initiatives, the goal of this workshop is to encourage their designers to consider topics of governance, privacy, trust, and transparency. In this one-day session, we convene research and advocacy community members to reflect on critical platform work issues (e.g., worker surveillance, discrimination, wage theft, insufficient platform accountability) as well as to collaborate on codesigning data collectives that ethically and equitably address these concerns by supporting working collectivism and informing policy development.

Submitted 1 September, 2024; originally announced September 2024.

arXiv:2408.17224 [pdf, other]

Hadronic cross section measurements with the DAMPE space mission using 20GeV-10TeV cosmic-ray protons and $^4$He

Authors: F. Alemanno, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, H. T. Dai, A. De Benedittis, I. De Mitri, F. de Palma, A. Di Giovanni, Q. Ding, T. K. Dong, et al. (126 additional authors not shown)

Abstract: Precise direct cosmic-ray (CR) measurements provide an important probe to study the energetic particle sources in our Galaxy, and the interstellar environment through which these particles propagate. Uncertainties on hadronic models, ion-nucleon cross sections in particular, are currently the limiting factor towards obtaining more accurate CR ion flux measurements with calorimetric space-based experiments. We present an energy-dependent measurement of the inelastic cross section of protons and helium-4 nuclei (alpha particles) on a Bi$_4$Ge$_3$O$_{12}$ target, using 88 months of data collected by the DAMPE space mission. The kinetic energy range per nucleon of the measurement points ranges from 18 GeV to 9 TeV for protons, and from 5 GeV/n to 3 TeV/n for helium-4 nuclei. Our results lead to a significant improvement of the CR flux normalisation. In the case of helium-4, these results correspond to the first cross section measurements on a heavy target material at energies above 10 GeV/n.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 17 pages, submitted to PRD

arXiv:2407.18919 [pdf]

Accelerating Drug Safety Assessment using Bidirectional-LSTM for SMILES Data

Authors: K. Venkateswara Rao, Kunjam Nageswara Rao, G. Sita Ratnam

Abstract: Computational methods are useful in accelerating the pace of drug discovery. Drug discovery involves several steps, such as target identification and validation, lead discovery, and lead optimisation. In the phase of lead optimisation, the absorption, distribution, metabolism, excretion, and toxicity properties of lead compounds are assessed. This work addresses the issue of predicting toxicity and solubility in the lead compounds, represented in Simplified Molecular Input Line Entry System (SMILES) notation. Among the different approaches that work on SMILES data, the proposed model was built using a sequence-based approach. The proposed Bi-Directional Long Short Term Memory (BiLSTM) is a variant of Recurrent Neural Network (RNN) that processes input molecular sequences for the comprehensive examination of the structural features of molecules from both forward and backward directions. The proposed work aims to understand the sequential patterns encoded in the SMILES strings, which are then utilised for predicting the toxicity of the molecules. The proposed model on the ClinTox dataset surpasses previous approaches such as Trimnet and Pre-training Graph neural networks (GNN) by achieving a ROC accuracy of 0.96. BiLSTM outperforms the previous model on FreeSolv dataset with a low RMSE value of 1.22 in solubility prediction.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 10 pages

arXiv:2406.19150 [pdf, other]

RAVEN: Multitask Retrieval Augmented Vision-Language Learning

Authors: Varun Nagaraj Rao, Siddharth Choudhary, Aditya Deshpande, Ravi Kumar Satzoda, Srikar Appalaraju

Abstract: The scaling of large language models to encode all the world's knowledge in model parameters is unsustainable and has exacerbated resource barriers. Retrieval-Augmented Generation (RAG) presents a potential solution, yet its application to vision-language models (VLMs) is underexplored. Existing methods focus on models designed for single tasks. Furthermore, they're limited by the need for resource-intensive pre-training, additional parameter requirements, unaddressed modality prioritization and lack of clear benefit over non-retrieval baselines. This paper introduces RAVEN, a multitask retrieval-augmented VLM framework that enhances base VLMs through efficient, task-specific fine-tuning. By integrating retrieval-augmented samples without the need for additional retrieval-specific parameters, we show that the model acquires retrieval properties that are effective across multiple tasks. Our results and extensive ablations across retrieved modalities for the image captioning and VQA tasks indicate significant performance improvements compared to non-retrieved baselines: +1 CIDEr on MSCOCO, +4 CIDEr on NoCaps and nearly a +3% accuracy on specific VQA question types. This underscores the efficacy of applying RAG approaches to VLMs, marking a stride toward more efficient and accessible multimodal learning.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.10768 [pdf, other]

Rideshare Transparency: Translating Gig Worker Insights on AI Platform Design to Policy

Authors: Varun Nagaraj Rao, Samantha Dalal, Eesha Agarwal, Dana Calacci, Andrés Monroy-Hernández

Abstract: Rideshare platforms exert significant control over workers through algorithmic systems that can result in financial, emotional, and physical harm. What steps can platforms, designers, and practitioners take to mitigate these negative impacts and meet worker needs? In this paper, through a novel mixed methods study combining an LLM-based analysis of over 1 million comments posted to online platform worker communities with semi-structured interviews of workers, we thickly characterize transparency-related harms, mitigation strategies, and worker needs while validating and contextualizing our findings within the broader worker community. Our findings expose a transparency gap between existing platform designs and the information drivers need, particularly concerning promotions, fares, routes, and task allocation. Our analysis suggests that rideshare workers need key pieces of information, which we refer to as indicators, to make informed work decisions. These indicators include details about rides, driver statistics, algorithmic implementation details, and platform policy information. We argue that instead of relying on platforms to include such information in their designs, new regulations that require platforms to publish public transparency reports may be a more effective solution to improve worker well-being. We offer recommendations for implementing such a policy.

Submitted 19 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.07667 [pdf, other]

PLT-D3: A High-fidelity Dynamic Driving Simulation Dataset for Stereo Depth and Scene Flow

Authors: Joshua Tokarsky, Ibrahim Abdulhafiz, Satya Ayyalasomayajula, Mostafa Mohsen, Navya G. Rao, Adam Forbes

Abstract: Autonomous driving has experienced remarkable progress, bolstered by innovations in computational hardware and sophisticated deep learning methodologies. The foundation of these advancements rests on the availability and quality of datasets, which are crucial for the development and refinement of dependable and versatile autonomous driving algorithms. While numerous datasets have been developed to support the evolution of autonomous driving perception technologies, few offer the diversity required to thoroughly test and enhance system robustness under varied weather conditions. Many public datasets lack the comprehensive coverage of challenging weather scenarios and detailed, high-resolution data, which are critical for training and validating advanced autonomous-driving perception models. In this paper, we introduce PLT-D3, a Dynamic-weather Driving Dataset, designed specifically to enhance autonomous driving systems' adaptability to diverse weather conditions. PLT-D3 provides high-fidelity stereo depth and scene flow ground truth data generated using Unreal Engine 5. In particular, this dataset includes synchronized high-resolution stereo image sequences that replicate a wide array of dynamic weather scenarios including rain, snow, fog, and diverse lighting conditions, offering an unprecedented level of realism in simulation-based testing. The primary aim of PLT-D3 is to address the scarcity of comprehensive training and testing resources that can simulate real-world weather variations. Benchmarks have been established for several critical autonomous driving tasks using PLT-D3, such as depth estimation, optical flow and scene-flow to measure and enhance the performance of state-of-the-art models.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2405.10391 [pdf, other]

Vision Transformers for End-to-End Vision-Based Quadrotor Obstacle Avoidance

Authors: Anish Bhattacharya, Nishanth Rao, Dhruv Parikh, Pratik Kunapuli, Yuwei Wu, Yuezhan Tao, Nikolai Matni, Vijay Kumar

Abstract: We demonstrate the capabilities of an attention-based end-to-end approach for high-speed vision-based quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art learning architectures. Quadrotor unmanned aerial vehicles (UAVs) have tremendous maneuverability when flown fast; however, as flight speed increases, traditional model-based approaches to navigation via independent perception, mapping, planning, and control modules break down due to increased sensor noise, compounding errors, and increased processing latency. Thus, learning-based, end-to-end vision-to-control networks have been shown to have great potential for online control of these fast robots through cluttered environments. We train and compare convolutional, U-Net, and recurrent architectures against vision transformer (ViT) models for depth image-to-control in high-fidelity simulation, observing that ViT models are more effective than others as quadrotor speeds increase and in generalization to unseen environments, while the addition of recurrence further improves performance while reducing quadrotor energy cost across all tested flight speeds. We assess performance at speeds of up to 7 m/s in simulation and hardware. To the best of our knowledge, this is the first work to utilize vision transformers for end-to-end vision-based quadrotor control.

Submitted 27 September, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: 11 pages, 18 figures, 3 tables (with supplementary)

arXiv:2405.07526 [pdf, other]

doi 10.1145/3589335.3648327

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

Authors: Qi Chen, Xiubo Geng, Corby Rosset, Carolyn Buractaon, Jingwen Lu, Tao Shen, Kun Zhou, Chenyan Xiong, Yeyun Gong, Paul Bennett, Nick Craswell, Xing Xie, Fan Yang, Bryan Tower, Nikhil Rao, Anlei Dong, Wenqi Jiang, Zheng Liu, Mingqin Li, Chuanjie Liu, Zengzhong Li, Rangan Majumder, Jennifer Neville, Andy Oakley, Knut Magne Risvik, et al. (6 additional authors not shown)

Abstract: Recent breakthroughs in large models have highlighted the critical significance of data scale, labels and modals. In this paper, we introduce MS MARCO Web Search, the first large-scale information-rich web dataset, featuring millions of real clicked query-document labels. This dataset closely mimics real-world web document and query distribution, provides rich information for various kinds of downstream tasks and encourages research in various areas, such as generic end-to-end neural indexer models, generic embedding models, and next generation information access system with large language models. MS MARCO Web Search offers a retrieval benchmark with three web retrieval challenge tasks that demand innovations in both machine learning and information retrieval system research domains. As the first dataset that meets large, real and rich data requirements, MS MARCO Web Search paves the way for future advancements in AI and system research. MS MARCO Web Search dataset is available at: https://github.com/microsoft/MS-MARCO-Web-Search.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 10 pages, 6 figures, for associated dataset, see http://github.com/microsoft/MS-MARCO-Web-Search

arXiv:2405.05345 [pdf, other]

QuaLLM: An LLM-based Framework to Extract Quantitative Insights from Online Forums

Authors: Varun Nagaraj Rao, Eesha Agarwal, Samantha Dalal, Dan Calacci, Andrés Monroy-Hernández

Abstract: Online discussion forums provide crucial data to understand the concerns of a wide range of real-world communities. However, the typical qualitative and quantitative methods used to analyze those data, such as thematic analysis and topic modeling, are infeasible to scale or require significant human effort to translate outputs to human-readable forms. This study introduces QuaLLM, a novel LLM-based framework to analyze and extract quantitative insights from text data on online forums. The framework consists of a novel prompting methodology and evaluation strategy. We applied this framework to analyze over one million comments from two of Reddit's rideshare worker communities, marking the largest study of its type. We uncover significant worker concerns regarding AI and algorithmic platform decisions, responding to regulatory calls about worker insights. In short, our work sets a new precedent for AI-assisted quantitative data analysis to surface concerns from online forums.

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted to CHI LLM as Research Tools Workshop (2024)

arXiv:2405.00820 [pdf, other]

HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond

Authors: Stefan Abi-Karam, Rishov Sarkar, Allison Seigler, Sean Lowe, Zhigang Wei, Hanqiu Chen, Nanditha Rao, Lizy John, Aman Arora, Cong Hao

Abstract: Machine learning (ML) techniques have been applied to high-level synthesis (HLS) flows for quality-of-result (QoR) prediction and design space exploration (DSE). Nevertheless, the scarcity of accessible high-quality HLS datasets and the complexity of building such datasets present challenges. Existing datasets have limitations in terms of benchmark coverage, design space enumeration, vendor extensibility, or lack of reproducible and extensible software for dataset construction. Many works also lack user-friendly ways to add more designs, limiting wider adoption of such datasets. In response to these challenges, we introduce HLSFactory, a comprehensive framework designed to facilitate the curation and generation of high-quality HLS design datasets. HLSFactory has three main stages: 1) a design space expansion stage to elaborate single HLS designs into large design spaces using various optimization directives across multiple vendor tools, 2) a design synthesis stage to execute HLS and FPGA tool flows concurrently across designs, and 3) a data aggregation stage for extracting standardized data into packaged datasets for ML usage. This tripartite architecture ensures broad design space coverage via design space expansion and supports multiple vendor tools. Users can contribute to each stage with their own HLS designs and synthesis results and extend the framework itself with custom frontends and tool flows. We also include an initial set of built-in designs from common HLS benchmarks curated open-source HLS designs.
We showcase the versatility and multi-functionality of our framework through six case studies: I) Design space sampling; II) Fine-grained parallelism backend speedup; III) Targeting Intel's HLS flow; IV) Adding new auxiliary designs; V) Integrating published HLS data; VI) HLS tool version regression benchmarking. Code at https://github.com/sharc-lab/HLSFactory.

Submitted 17 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

Comments: Edit to "Section V.E" for proper attribution of open-source HLSyn, AutoDSE, and the Merlin compiler

arXiv:2402.17896 [pdf, other]

Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents

Authors: Corby Rosset, Ho-Lam Chung, Guanghui Qin, Ethan C. Chau, Zhuo Feng, Ahmed Awadallah, Jennifer Neville, Nikhil Rao

Abstract: Existing question answering (QA) datasets are no longer challenging to most powerful Large Language Models (LLMs). Traditional QA benchmarks like TriviaQA, NaturalQuestions, ELI5 and HotpotQA mainly study ``known unknowns'' with clear indications of both what information is missing, and how to find it to answer the question. Hence, good performance on these benchmarks provides a false sense of security. A yet unmet need of the NLP community is a bank of non-factoid, multi-perspective questions involving a great deal of unclear information needs, i.e. ``unknown unknowns''. We claim we can find such questions in search engine logs, which is surprising because most question-intent queries are indeed factoid. We present Researchy Questions, a dataset of search engine queries tediously filtered to be non-factoid, ``decompositional'' and multi-perspective. We show that users spend a lot of ``effort'' on these questions in terms of signals like clicks and session length, and that they are also challenging for GPT-4. We also show that ``slow thinking'' answering techniques, like decomposition into sub-questions, show benefit over answering directly. We release $\sim$ 100k Researchy Questions, along with the Clueweb22 URLs that were clicked.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2312.17479 [pdf, other]

Culturally-Attuned Moral Machines: Implicit Learning of Human Value Systems by AI through Inverse Reinforcement Learning

Authors: Nigini Oliveira, Jasmine Li, Koosha Khalvati, Rodolfo Cortes Barragan, Katharina Reinecke, Andrew N. Meltzoff, Rajesh P. N. Rao

Abstract: Constructing a universal moral code for artificial intelligence (AI) is difficult or even impossible, given that different human cultures have different definitions of morality and different societal norms. We therefore argue that the value system of an AI should be culturally attuned: just as a child raised in a particular culture learns the specific values and norms of that culture, we propose that an AI agent operating in a particular human community should acquire that community's moral, ethical, and cultural codes. How AI systems might acquire such codes from human observation and interaction has remained an open question. Here, we propose using inverse reinforcement learning (IRL) as a method for AI agents to acquire a culturally-attuned value system implicitly. We test our approach using an experimental paradigm in which AI agents use IRL to learn different reward functions, which govern the agents' moral values, by observing the behavior of different cultural groups in an online virtual world requiring real-time decision making. We show that an AI agent learning from the average behavior of a particular cultural group can acquire altruistic characteristics reflective of that group's behavior, and this learned value system can generalize to new scenarios requiring altruistic judgments.
Our results provide, to our knowledge, the first demonstration that AI agents could potentially be endowed with the ability to continually learn their values and norms from observing and interacting with humans, thereby becoming attuned to the culture they are operating in.

Submitted 29 December, 2023; originally announced December 2023.

arXiv:2312.10049 [pdf]

Knowledge Graph Reasoning Based on Attention GCN

Authors: Meera Gupta, Ravi Khanna, Divya Choudhary, Nandini Rao

Abstract: We propose a novel technique to enhance Knowledge Graph Reasoning by combining Graph Convolution Neural Network (GCN) with the Attention Mechanism. This approach utilizes the Attention Mechanism to examine the relationships between entities and their neighboring nodes, which helps to develop detailed feature vectors for each entity. The GCN uses shared parameters to effectively represent the characteristics of adjacent entities. We first learn the similarity of entities for node representation learning. By integrating the attributes of the entities and their interactions, this method generates extensive implicit feature vectors for each entity, improving performance in tasks including entity classification and link prediction, outperforming traditional neural network models. To conclude, this work provides crucial methodological support for a range of applications, such as search engines, question-answering systems, recommendation systems, and data integration tasks.

Submitted 27 January, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

arXiv:2312.01744 [pdf, other]

doi 10.1109/WASPAA58266.2023.10248144

SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement

Authors: Martin Strauss, Nicola Pia, Nagashree K. S. Rao, Bernd Edler

Abstract: This paper proposes SEFGAN, a Deep Neural Network (DNN) combining maximum likelihood training and Generative Adversarial Networks (GANs) for efficient speech enhancement (SE). For this, a DNN is trained to synthesize the enhanced speech conditioned on noisy speech using a Normalizing Flow (NF) as generator in a GAN framework. While the combination of likelihood models and GANs is not trivial, SEFGAN demonstrates that a hybrid adversarial and maximum likelihood training approach enables the model to maintain high quality audio generation and log-likelihood estimation. Our experiments indicate that this approach strongly outperforms the baseline NF-based model without introducing additional complexity to the enhancement network. A comparison using computational metrics and a listening experiment reveals that SEFGAN is competitive with other state-of-the-art models.

Submitted 4 December, 2023; originally announced December 2023.

Comments: Preprint. Accepted to IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2023

arXiv:2310.18918 [pdf, other]

Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

Authors: Nurendra Choudhary, Nikhil Rao, Chandan K. Reddy

Abstract: The progress in hyperbolic neural networks (HNNs) research is hindered by their absence of inductive bias mechanisms, which are essential for generalizing to new tasks and facilitating scalable learning over large datasets. In this paper, we aim to alleviate these issues by learning generalizable inductive biases from the nodes' local subgraph and transfer them for faster learning over new subgraphs with a disjoint set of nodes, edges, and labels in a few-shot setting. We introduce a novel method, Hyperbolic GRAph Meta Learner (H-GRAM), that, for the tasks of node classification and link prediction, learns transferable information from a set of support local subgraphs in the form of hyperbolic meta gradients and label hyperbolic protonets to enable faster learning over a query set of new tasks dealing with disjoint subgraphs. Furthermore, we show that an extension of our meta-learning framework also mitigates the scalability challenges seen in HNNs faced by existing approaches. Our comparative analysis shows that H-GRAM effectively learns and transfers information in multiple challenging few-shot settings compared to other state-of-the-art baselines. Additionally, we demonstrate that, unlike standard HNNs, our approach is able to scale over large graph datasets and improve performance over its Euclidean counterparts.

Submitted 29 October, 2023; originally announced October 2023.

Comments: Accepted to NeurIPS 2023. 14 pages of main paper, 5 pages of supplementary

arXiv:2310.11308 [pdf, other]

doi 10.1103/PhysRevA.109.032435

Protocols for counterfactual and twin-field quantum digital signature

Authors: Vinod N. Rao, Shrikant Utagi, Anirban Pathak, R. Srikanth

Abstract: Quantum digital signature (QDS) is the quantum version of its classical counterpart, and can offer security against attacks of repudiation, signature forging and external eavesdropping, on the basis of quantum mechanical no-go principles. Here we propose a QDS scheme based on quantum counterfactuality, which leverages the concept of interaction-free measurement. Employing the idea behind twin-field cryptography, we show how this two-way protocol can be turned into an equivalent non-counterfactual, one-way protocol, that is both more practical and also theoretically helpful in assessing the experimental feasibility of the first protocol. The proposed QDS protocol can be experimentally implemented with current quantum technology.

Submitted 19 June, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: 11 pages, 4 figures

Journal ref: Phys. Rev. A 109, 032435 (2024)

arXiv:2310.05972 [pdf, other]

Normality of I-V Measurements Using ML

Authors: Anees Al-Najjar, Nageswara S. V. Rao, Craig A. Bridges, Sheng Dai

Abstract: Electrochemistry ecosystems are promising for accelerating the design and discovery of electrochemical systems for energy storage and conversion, by automating significant parts of workflows that combine synthesis and characterization experiments with computations. They require the integration of flow controllers, solvent containers, pumps, fraction collectors, and potentiostats, all connected to an electrochemical cell. These are specialized instruments with custom software that is not originally designed for network integration. We developed network and software solutions for electrochemical workflows that adapt system and instrument settings in real-time for multiple rounds of experiments. We demonstrate this automated workflow by remotely operating the instruments and collecting their measurements to generate a voltammogram (I-V profile) of an electrolyte solution in an electrochemical cell. These measurements are made available at the remote computing system and used for subsequent analysis. In this paper, we focus on a novel, analytically validated machine learning (ML) method for an electrochemistry ecosystem to ensure that I-V measurements are consistent with the normal experimental conditions, and to detect abnormal conditions, such as disconnected electrodes or low cell content volume.

Submitted 28 September, 2023; originally announced October 2023.

Comments: published at eScience 2023

Journal ref: in 2023 IEEE 19th International Conference on e-Science (e-Science), Limassol, Cyprus, 2023 pp. 1-2

arXiv:2310.02409 [pdf, other]

Dodo: Dynamic Contextual Compression for Decoder-only LMs

Authors: Guanghui Qin, Corby Rosset, Ethan C. Chau, Nikhil Rao, Benjamin Van Durme

Abstract: Transformer-based language models (LMs) are inefficient in long contexts. We propose Dodo, a solution for context compression. Instead of one vector per token in a standard transformer model, Dodo represents text with a dynamic number of hidden states at each layer, reducing the cost of self-attention to a fraction of typical time and space. Moreover, off-the-shelf models such as LLaMA can be adapted to Dodo by efficient parameter tuning methods such as LoRA. In use, Dodo can act as either an autoregressive LM or a context compressor for downstream tasks. We demonstrate through experiments in language modeling, question answering, and summarization that Dodo retains capabilities in these tasks, while drastically reducing the overhead during decoding. For example, in the autoencoding task, Dodo shrinks context at a 20x compression ratio with a BLEU score of 98% for reconstruction, achieving nearly lossless encoding.

Submitted 13 June, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: ACL 2024 camera-ready. 15 pages and 7 figures

ACM Class: I.2.7; I.2.6

arXiv:2310.02263 [pdf, other]

Automatic Pair Construction for Contrastive Post-training

Authors: Canwen Xu, Corby Rosset, Ethan C. Chau, Luciano Del Corro, Shweti Mahajan, Julian McAuley, Jennifer Neville, Ahmed Hassan Awadallah, Nikhil Rao

Abstract: Alignment serves as an important step to steer large language models (LLMs) towards human preferences. In this paper, we propose an automatic way to construct contrastive data for LLM, using preference pairs from multiple models of varying strengths (e.g., InstructGPT, ChatGPT and GPT-4). We compare the contrastive techniques of SLiC and DPO to SFT baselines and find that DPO provides a step-function improvement even after continuing SFT saturates. We also explore a data curriculum learning scheme for contrastive post-training, which starts by learning from "easier" pairs and transitions to "harder" ones, which further improves alignment. Finally, we scale up our experiments to train with more data and larger models like Orca. Remarkably, our automatic contrastive post-training further improves the performance of Orca, already a state-of-the-art instruction learning model tuned with GPT-4 outputs, to outperform ChatGPT.

Submitted 2 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: NAACL 2024 (Findings)

arXiv:2310.01602 [pdf, other]

CAT-LM: Training Language Models on Aligned Code And Tests

Authors: Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn

Abstract: Testing is an integral part of the software development process. Yet, writing tests is time-consuming and therefore often neglected. Classical test generation tools such as EvoSuite generate behavioral test suites by optimizing for coverage, but tend to produce tests that are hard to understand. Language models trained on code can generate code that is highly similar to that written by humans, but current models are trained to generate each file separately, as is standard practice in natural language processing, and thus fail to consider the code-under-test context when producing a test file. In this work, we propose the Aligned Code And Tests Language Model (CAT-LM), a GPT-style language model with 2.7 Billion parameters, trained on a corpus of Python and Java projects. We utilize a novel pretraining signal that explicitly considers the mapping between code and test files when available. We also drastically increase the maximum sequence length of inputs to 8,192 tokens, 4x more than typical code generation models, to ensure that the code context is available to the model when generating test code. We analyze its usefulness for realistic applications, showing that sampling with filtering (e.g., by compilability, coverage) allows it to efficiently produce tests that achieve coverage similar to ones written by developers while resembling their writing style.
By utilizing the code context, CAT-LM generates more valid tests than even much larger language models trained with more data (CodeGen 16B and StarCoder) and substantially outperforms a recent test-specific model (TeCo) at test completion. Overall, our work highlights the importance of incorporating software-specific insights when training language models for code and paves the way to more powerful automated test generation.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.11512 [pdf, other]

Multidimensional well-being of US households at a fine spatial scale using fused household surveys: fusionACS

Authors: Kevin Ummel, Miguel Poblete-Cazenave, Karthik Akkiraju, Nick Graetz, Hero Ashman, Cora Kingdon, Steven Herrera Tenorio, Aaryaman "Sunny" Singhal, Daniel Aldana Cohen, Narasimha D. Rao

Abstract: Social science often relies on surveys of households and individuals. Dozens of such surveys are regularly administered by the U.S. government. However, they field independent, unconnected samples with specialized questions, limiting research questions to those that can be answered by a single survey. The fusionACS project seeks to integrate data from multiple U.S. household surveys by statistically "fusing" variables from "donor" surveys onto American Community Survey (ACS) microdata. This results in an integrated microdataset of household attributes and well-being dimensions that can be analyzed to address research questions in ways that are not currently possible. The presented data comprise the fusion onto the ACS of select donor variables from the Residential Energy Consumption Survey (RECS) of 2015, the National Household Transportation Survey (NHTS) of 2017, the American Housing Survey (AHS) of 2019, and the Consumer Expenditure Survey - Interview (CEI) for the years 2015-2019. The underlying statistical techniques are included in an open-source $R$ package, fusionModel, that provides generic tools for the creation, analysis, and validation of fused microdata.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 35 pages, 6 figures

arXiv:2308.16054 [pdf]

Capital Structure Dynamics and Financial Performance in Indian Banks (An Analysis of Mergers and Acquisitions)

Authors: Kurada T S S Satyanarayana, Addada Narasimha Rao, Kumpatla jaya surya

Abstract: This research investigates the multifaceted relationship underlying capital structure dynamics along with financial performance as a result of mergers and acquisitions, or M&As, in Indian banks. In the face of increasing competition, banks have deliberately embraced M&A as a strategy of improving commercial prospects and maintaining financial stability. The primary goal of this study is to examine the changes in the capital framework and financial results of banks before and after M&A transactions. The investigation, which employs a paired t-test as a method of statistical analysis, is based on a review of annual reports from selected banks over a two-year period before and after M&A transactions. The paired t-test approach allows for a thorough statistical analysis of interconnected datasets, revealing the subtle influence of M&A attempts on both bank financial performance as well as capital structure dynamics. The study's findings have the potential to add to the current body of knowledge on organisational planning, managing finances, and capital structure optimisation. The research has practical significance for financial companies, legislators, and scholars interested in understanding the profound effects of M&A inside the arena of financial institutions that operate within fiercely competitive landscapes because it provides comprehensive insights regarding the complex consequences of banking merger and acquisition (M&A) deals on capital structure as well as financial performance.
Finally, the goal of this research is to provide the banking sector with educated decision-making capabilities and strategic guidance to businesses facing heightened competition while coping with the complexities of capital structure.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 12 pages, 4 tables

arXiv:2308.11809 [pdf, other]

Expressive probabilistic sampling in recurrent neural networks

Authors: Shirui Chen, Linxing Preston Jiang, Rajesh P. N. Rao, Eric Shea-Brown

Abstract: In sampling-based Bayesian models of brain function, neural activities are assumed to be samples from probability distributions that the brain uses for probabilistic computation. However, a comprehensive understanding of how mechanistic models of neural dynamics can sample from arbitrary distributions is still lacking. We use tools from functional analysis and stochastic differential equations to explore the minimum architectural requirements for $\textit{recurrent}$ neural circuits to sample from complex distributions. We first consider the traditional sampling model consisting of a network of neurons whose outputs directly represent the samples (sampler-only network). We argue that synaptic current and firing-rate dynamics in the traditional model have limited capacity to sample from a complex probability distribution. We show that the firing rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution. We call such circuits reservoir-sampler networks (RSNs). We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling. We empirically demonstrate our model's ability to sample from several complex data distributions using the proposed neural dynamics and discuss its applicability to developing the next generation of sampling-based brain models.

Submitted 14 November, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

arXiv:2308.07870 [pdf, other]

Brain-Inspired Computational Intelligence via Predictive Coding

Authors: Tommaso Salvatori, Ankur Mali, Christopher L. Buckley, Thomas Lukasiewicz, Rajesh P. N. Rao, Karl Friston, Alexander Ororbia

Abstract: Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.

Submitted 15 August, 2023; originally announced August 2023.

Comments: 37 Pages, 9 Figures

arXiv:2308.07600 [pdf]

Impact of Oxygen Pressure on Ferroelectric Stability of La-doped Hafnia Grown by PLD

Authors: Badari Narayana Rao, Shintaro Yasui, Hiroko Yokota

Abstract: Thin films of HfO2 doped with 4% La were fabricated on LSMO/STO (100) substrates using pulsed laser deposition. The stability of the ferroelectric orthorhombic phase in the hafnia films was investigated with respect to varying oxygen pressure during deposition. X-ray diffraction and X-ray photoelectron spectroscopy measurements were carried out to analyze the structure and composition of the films and correlated with their ferroelectric properties. Surprisingly, the ferroelectricity of the hafnia films showed a dependence on oxygen pressure during deposition of LSMO bottom electrode as well. The reason for this dependence is discussed in terms of the active role of non-lattice oxygen in the ferroelectric switching of hafnia.

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2307.14049 [pdf]

Capital Structure Theories and its Practice, A study with reference to select NSE listed public sectors banks, India

Authors: Kurada T S S Satyanarayana, Addada Narasimha Rao

Abstract: Among the various factors affecting the firm's positioning and performance in modern day markets, capital structure of the firm has its own way of expressing itself as a crucial one. With the rapid changes in technology, firms are being pushed onto a paradigm that is burdening the capital management process. Hence the study of capital structure changes gives the investors an insight into firm's behavior and intrinsic goals. These changes will vary for firms in different sectors. This work considers the banking sector, which has a unique capital structure for the given regulations of its operations in India. The capital structure behavioral changes in a few public sector banks are studied in this paper. A theoretical framework has been developed from the popular capital structure theories and hypotheses are derived from them accordingly. The main idea is to validate different theories with real time performance of the select banks from 2011 to 2022. Using statistical techniques like regression and correlation, tested hypotheses have resulted in establishing the relation between debt component and financial performance variables of the select banks which are helping in understanding the theories in practice.

Submitted 26 July, 2023; originally announced July 2023.

Comments: 19 pages, 8 figures

arXiv:2307.06883 [pdf, other]

Cyber Framework for Steering and Measurements Collection Over Instrument-Computing Ecosystems

Authors: Anees Al-Najjar, Nageswara S. V. Rao, Ramanan Sankaran, Helia Zandi, Debangshu Mukherjee, Maxim Ziatdinov, Craig Bridges

Abstract: We propose a framework to develop cyber solutions to support the remote steering of science instruments and measurements collection over instrument-computing ecosystems. It is based on provisioning separate data and control connections at the network level, and developing software modules consisting of Python wrappers for instrument commands and Pyro server-client codes that make them available across the ecosystem network. We demonstrate automated measurement transfers and remote steering operations in a microscopy use case for materials research over an ecosystem of Nion microscopes and computing platforms connected over site networks. The proposed framework is currently under further refinement and being adopted to science workflows with automated remote experiments steering for autonomous chemistry laboratories and smart energy grid simulations.

Submitted 12 July, 2023; originally announced July 2023.

Comments: Paper accepted for presentation at IEEE SMARTCOMP 2023

arXiv:2306.07527 [pdf, other]

doi 10.1145/3593013.3594115

Discrimination through Image Selection by Job Advertisers on Facebook

Authors: Varun Nagaraj Rao, Aleksandra Korolova

Abstract: Targeted advertising platforms are widely used by job advertisers to reach potential employees; thus issues of discrimination due to targeting that have surfaced have received widespread attention. Advertisers could misuse targeting tools to exclude people based on gender, race, location and other protected attributes from seeing their job ads. In response to legal actions, Facebook disabled the ability for explicit targeting based on many attributes for some ad categories, including employment. Although this is a step in the right direction, prior work has shown that discrimination can take place not just due to the explicit targeting tools of the platforms, but also due to the impact of the biased ad delivery algorithm. Thus, one must look at the potential for discrimination more broadly, and not merely through the lens of the explicit targeting tools. In this work, we propose and investigate the prevalence of a new means for discrimination in job advertising, that combines both targeting and delivery -- through the disproportionate representation or exclusion of people of certain demographics in job ad images. We use the Facebook Ad Library to demonstrate the prevalence of this practice through: (1) evidence of advertisers running many campaigns using ad images of people of only one perceived gender, (2) systematic analysis for gender representation in all current ad campaigns for truck drivers and nurses, (3) longitudinal analysis of ad campaign image use by gender and race for select advertisers.
After establishing that the discrimination resulting from a selective choice of people in job ad images, combined with algorithmic amplification of skews by the ad delivery algorithm, is of immediate concern, we discuss approaches and challenges for addressing it.

Submitted 12 June, 2023; originally announced June 2023.

Comments: Published in FAccT 2023

arXiv:2306.05912 [pdf, other]

Single-Image-Based Deep Learning for Segmentation of Early Esophageal Cancer Lesions

Authors: Haipeng Li, Dingrui Liu, Yu Zeng, Shuaicheng Liu, Tao Gan, Nini Rao, Jinlin Yang, Bing Zeng

Abstract: Accurate segmentation of lesions is crucial for diagnosis and treatment of early esophageal cancer (EEC). However, neither traditional nor deep learning-based methods up to today can meet the clinical requirements, with the mean Dice score - the most important metric in medical image analysis - hardly exceeding 0.75. In this paper, we present a novel deep learning approach for segmenting EEC lesions. Our approach stands out for its uniqueness, as it relies solely on a single image coming from one patient, forming the so-called "You-Only-Have-One" (YOHO) framework. On one hand, this "one-image-one-network" learning ensures complete patient privacy as it does not use any images from other patients as the training data. On the other hand, it avoids nearly all generalization-related problems since each trained network is applied only to the input image itself. In particular, we can push the training to "over-fitting" as much as possible to increase the segmentation accuracy. Our technical details include an interaction with clinical physicians to utilize their expertise, a geometry-based rendering of a single lesion image to generate the training set (the biggest novelty), and an edge-enhanced UNet. We have evaluated YOHO over an EEC data-set created by ourselves and achieved a mean Dice score of 0.888, which represents a significant advance toward clinical applications.

Submitted 9 June, 2023; originally announced June 2023.
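
For readers unfamiliar with the Dice score quoted in the abstract above, the following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the flat 0/1 list representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration; this is not the authors' evaluation code.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values (hypothetical representation).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Pixels marked as lesion in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total number of lesion pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_score(predicted, reference))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.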

arXiv:2306.04907 [pdf, other]

Estimation of Poverty Measures for Small Areas Under a Two-Fold Nested Error Linear Regression Model: Comparison of Two Methods

Authors: Maryam Sohrabi, J. N. K. Rao

Abstract: Demand for reliable statistics at a local area (small area) level has greatly increased in recent years. Traditional area-specific estimators based on probability samples are not adequate because of small sample size or even zero sample size in a local area. As a result, methods based on models linking the areas are widely used. World Bank focused on estimating poverty measures, in particular poverty incidence and poverty gap called FGT measures, using a simulated census method, called ELL, based on a one-fold nested error model for a suitable transformation of the welfare variable. Modified ELL methods leading to significant gain in efficiency over ELL also have been proposed under the one-fold model. An advantage of ELL and modified ELL methods is that distributional assumptions on the random effects in the model are not needed. In this paper, we extend ELL and modified ELL to two-fold nested error models to estimate poverty indicators for areas (say a state) and subareas (say counties within a state). Our simulation results indicate that the modified ELL estimators lead to large efficiency gains over ELL at the area level and subarea level. Further, modified ELL method retaining both area and subarea estimated effects in the model (called MELL2) performs significantly better in terms of mean squared error (MSE) for sampled subareas than the modified ELL retaining only estimated area effect in the model (called MELL1).

Submitted 7 June, 2023; originally announced June 2023.
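
As a small illustration of the FGT measures named in the abstract above, the sketch below evaluates the standard Foster-Greer-Thorbecke formula, the average of ((z - y) / z)^alpha over the units whose welfare y falls below the poverty line z. With alpha = 0 it gives the poverty incidence and with alpha = 1 the poverty gap. It only computes the measure on a fully known welfare vector; it does not implement ELL, MELL1, or MELL2. The function name `fgt` and the sample numbers are hypothetical.

```python
def fgt(welfare, poverty_line, alpha):
    """FGT poverty measure of order alpha for a list of welfare values.

    alpha = 0 gives poverty incidence; alpha = 1 gives poverty gap.
    """
    if poverty_line <= 0:
        raise ValueError("poverty_line must be positive")
    if not welfare:
        raise ValueError("welfare must be non-empty")
    total = 0.0
    for y in welfare:
        if y < poverty_line:
            # Relative shortfall of a poor unit, raised to the power alpha.
            total += ((poverty_line - y) / poverty_line) ** alpha
    # Average over the whole population, poor and non-poor alike.
    return total / len(welfare)


if __name__ == "__main__":
    incomes = [50, 100, 150, 200]  # hypothetical welfare values
    line = 100                     # hypothetical poverty line
    print(fgt(incomes, line, 0))   # poverty incidence
    print(fgt(incomes, line, 1))   # poverty gap
```

In the small-area setting the welfare values of non-sampled units are unknown, which is why model-based methods of the kind compared in the paper are needed before a formula like this can be applied.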

arXiv:2305.20015 [pdf, other]

AI for Low-Code for AI

Authors: Nikitha Rao, Jason Tsay, Kiran Kate, Vincent J. Hellendoorn, Martin Hirzel

Abstract: Low-code programming allows citizen developers to create programs with minimal coding effort, typically via visual (e.g. drag-and-drop) interfaces. In parallel, recent AI-powered tools such as Copilot and ChatGPT generate programs from natural language instructions. We argue that these modalities are complementary: tools like ChatGPT greatly reduce the need to memorize large APIs but still require their users to read (and modify) programs, whereas visual tools abstract away most or all programming but struggle to provide easy access to large APIs. At their intersection, we propose LowCoder, the first low-code tool for developing AI pipelines that supports both a visual programming interface (LowCoder_VP) and an AI-powered natural language interface (LowCoder_NL). We leverage this tool to provide some of the first insights into whether and how these two modalities help programmers by conducting a user study. We task 20 developers with varying levels of AI expertise with implementing four ML pipelines using LowCoder, replacing the LowCoder_NL component with a simple keyword search in half the tasks. Overall, we find that LowCoder is especially useful for (i) Discoverability: using LowCoder_NL, participants discovered new operators in 75% of the tasks, compared to just 32.5% and 27.5% using web search or scrolling through options respectively in the keyword-search condition, and (ii) Iterative Composition: 82.5% of tasks were successfully completed and many initial pipelines were further successfully improved.
Qualitative analysis shows that AI helps users discover how to implement constructs when they know what to do, but still fails to support novices when they lack clarity on what they want to accomplish. Overall, our work highlights the benefits of combining the power of AI with low-code programming.

Submitted 31 May, 2023; originally announced May 2023.

arXiv:2305.09887 [pdf, other]

Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model Aggregation

Authors: Jiong Zhu, Aishwarya Reganti, Edward Huang, Charles Dickens, Nikhil Rao, Karthik Subbian, Danai Koutra

Abstract: Distributed training of GNNs enables learning on massive graphs (e.g., social and e-commerce networks) that exceed the storage and computational capacity of a single machine. To reach performance comparable to centralized training, distributed frameworks focus on maximally recovering cross-instance node dependencies with either communication across instances or periodic fallback to centralized training, which create overhead and limit the framework scalability. In this work, we present a simplified framework for distributed GNN training that does not rely on the aforementioned costly operations, and has improved scalability, convergence speed and performance over the state-of-the-art approaches. Specifically, our framework (1) assembles independent trainers, each of which asynchronously learns a local model on locally-available parts of the training graph, and (2) only conducts periodic (time-based) model aggregation to synchronize the local models. Backed by our theoretical analysis, instead of maximizing the recovery of cross-instance node dependencies -- which has been considered the key behind closing the performance gap between model aggregation and centralized training --, our framework leverages randomized assignment of nodes or super-nodes (i.e., collections of original nodes) to partition the training graph such that it improves data uniformity and minimizes the discrepancy of gradient and loss function across instances.
In our experiments on social and e-commerce networks with up to 1.3 billion edges, our proposed RandomTMA and SuperTMA approaches -- despite using less training data -- achieve state-of-the-art performance and 2.31x speedup compared to the fastest baseline, and show better robustness to trainer failures.

Submitted 16 May, 2023; originally announced May 2023.

Comments: 14 pages, 3 figures

arXiv:2304.14479 [pdf, ps, other]

Long-term cybersecurity applications enabled by quantum networks

Authors: Nicholas A. Peters, Muneer Alshowkan, Joseph C. Chapman, Raphael C. Pooser, Nageswara S. V. Rao, Raymond T. Newell

Abstract: If continental-scale quantum networks are realized, they will provide the resources needed to fulfill the potential for dramatic advances in cybersecurity through quantum-enabled cryptography applications. We describe recent progress and where the US is headed as well as argue that we go one step further and jointly develop quantum and conventional cryptography methods for joint deployments along the quantum backbone infrastructure.

Submitted 27 April, 2023; originally announced April 2023.

arXiv:2304.10053 [pdf, other]

doi 10.1364/OE.492539

Two-mode squeezing over deployed fiber coexisting with conventional communications

Authors: Joseph C. Chapman, Alexander Miloshevsky, Hsuan-Hao Lu, Nageswara Rao, Muneer Alshowkan, Nicholas A. Peters

Abstract: Squeezed light is a crucial resource for continuous-variable (CV) quantum information science. Distributed multi-mode squeezing is critical for enabling CV quantum networks and distributed quantum sensing. To date, multi-mode squeezing measured by homodyne detection has been limited to single-room experiments without coexisting classical signals, i.e., on "dark" fiber. Here, after distribution through separate fiber spools (5 km), $-0.9\pm0.1$-dB coexistent two-mode squeezing is measured. Moreover, after distribution through separate deployed campus fibers (about 250 m and 1.2 km), $-0.5\pm0.1$-dB coexistent two-mode squeezing is measured. Prior to distribution, the squeezed modes are each frequency multiplexed with several classical signals -- including the local oscillator and conventional network signals -- demonstrating that the squeezed modes do not need dedicated dark fiber. After distribution, joint two-mode squeezing is measured and recorded for post-processing using triggered homodyne detection in separate locations. This demonstration enables future applications in quantum networks and quantum sensing that rely on distributed multi-mode squeezing.

Submitted 12 July, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

Comments: 23 pages, 13 figures, 2 tables

arXiv:2304.02048 [pdf]

doi 10.1038/s41524-023-01142-0

Deep Learning for Automated Experimentation in Scanning Transmission Electron Microscopy

Authors: Sergei V. Kalinin, Debangshu Mukherjee, Kevin M. Roccapriore, Ben Blaiszik, Ayana Ghosh, Maxim A. Ziatdinov, A. Al-Najjar, Christina Doty, Sarah Akers, Nageswara S. Rao, Joshua C. Agar, Steven R. Spurgeon

Abstract: Machine learning (ML) has become critical for post-acquisition data analysis in (scanning) transmission electron microscopy, (S)TEM, imaging and spectroscopy. An emerging trend is the transition to real-time analysis and closed-loop microscope operation. The effective use of ML in electron microscopy now requires the development of strategies for microscopy-centered experiment workflow design and optimization. Here, we discuss the associated challenges with the transition to active ML, including sequential data analysis and out-of-distribution drift effects, the requirements for the edge operation, local and cloud data storage, and theory in the loop operations. Specifically, we discuss the relative contributions of human scientists and ML agents in the ideation, orchestration, and execution of experimental workflows and the need to develop universal hyper languages that can apply across multiple platforms. These considerations will collectively inform the operationalization of ML in next-generation experimentation.

Submitted 4 April, 2023; originally announced April 2023.

Comments: Review Article

arXiv:2304.00137 [pdf, ps, other]

doi 10.1103/PhysRevD.109.L121101

Measurement of the cosmic p+He energy spectrum from 50 GeV to 0.5 PeV with the DAMPE space mission

Authors: DAMPE Collaboration, F. Alemanno, C. Altomare, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, H. T. Dai, A. De Benedittis, I. De Mitri, F. de Palma, M. Deliyergiyev, et al. (130 additional authors not shown)

Abstract: Recent observations of the light component of the cosmic-ray spectrum have revealed unexpected features that motivate further and more precise measurements up to the highest energies. The Dark Matter Particle Explorer is a satellite-based cosmic-ray experiment that has been operational since December 2015, continuously collecting data on high-energy cosmic particles with very good statistics, energy resolution, and particle identification capabilities. In this work, the latest measurements of the energy spectrum of proton+helium in the energy range from 46 GeV to 464 TeV are presented. Among the most distinctive features of the spectrum, a spectral hardening at 600 GeV has been observed, along with a softening at 29 TeV measured with a 6.6σ significance. Moreover, the detector features and the analysis approach allowed for the extension of the spectral measurement up to the sub-PeV region. Even if with small statistical significance due to the low number of events, data suggest a new spectral hardening at about 150 TeV.

Submitted 14 August, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

Comments: Published on PRD

arXiv:2303.12436 [pdf, other]

doi 10.3847/1538-4357/acc574

The carbon star DY Persei may be a cool R Coronae Borealis variable

Authors: D. A. Garcia-Hernandez, N. Kameswara Rao, D. L. Lambert, K. Eriksson, A. B. S. Reddy, T. Masseron

Abstract: Optical and near-IR photometry suggests that the carbon star DY Persei exhibits fadings similar to those of R Coronae Borealis (RCB) variables. Photometric surveys of the Galaxy and Magellanic Clouds uncovered new DY Per variables with infrared photometry identifying them with cool carbon stars, perhaps, with an unusual tendency to shed mass. In an attempt to resolve DY Per's identity crisis -- a cool carbon giant or a cool RCB variable? -- we analyze a high-resolution H&K band spectrum of DY Per. The CO first-overtone bands in the K-band of DY Per show a high abundance of 18O such that 16O/18O = 4 ± 1, a ratio sharply at odds with published results for 'regular' cool carbon giants with 16O/18O ~ 1000 but this exceptionally low ratio is characteristic of RCB-variables and HdC stars. This similarity suggests that DY Per indeed may be a cool RCB variable. Current opinion considers RCB-variables to result from merger of a He onto a CO white dwarf; observed abundances of these H-deficient stars including the exceptionally low 16O/18O ratios are in fair accord with predicted compositions for white dwarf merger products. A H-deficiency for DY Per is not directly observable but is suggested from the strength of a HF line and an assumption that F may be overabundant, as observed and predicted for RCB stars.

Submitted 22 March, 2023; originally announced March 2023.

Comments: Accepted for publication in ApJ (16 pages and 4 figures)

arXiv:2302.14189 [pdf, other]

You Only Transfer What You Share: Intersection-Induced Graph Transfer Learning for Link Prediction

Authors: Wenqing Zheng, Edward W Huang, Nikhil Rao, Zhangyang Wang, Karthik Subbian

Abstract: Link prediction is central to many real-world applications, but its performance may be hampered when the graph of interest is sparse. To alleviate issues caused by sparsity, we investigate a previously overlooked phenomenon: in many cases, a densely connected, complementary graph can be found for the original graph. The denser graph may share nodes with the original graph, which offers a natural bridge for transferring selective, meaningful knowledge. We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications in e-commerce or academic co-authorship predictions. We develop a framework to effectively leverage the structural prior in this setting. We first create an intersection subgraph using the shared nodes between the two graphs, then transfer knowledge from the source-enriched intersection subgraph to the full target graph. In the second step, we consider two approaches: a modified label propagation, and a multi-layer perceptron (MLP) model in a teacher-student regime. Experimental results on proprietary e-commerce datasets and open-source citation graphs show that the proposed workflow outperforms existing transfer learning baselines that do not explicitly utilize the intersection structure.

Submitted 18 June, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: Accepted in TMLR (https://openreview.net/forum?id=Nn71AdKyYH)

arXiv:2301.05513 [pdf]

Exploring the substrate-driven morphological changes in Nd0.6Sr0.4MnO3 thin films

Authors: R S Mrinaleni, E P Amaladass, S Amirthapandian, A. T. Sathyanarayana, Jegadeesan P, Ganesan K, R M Sarguna, P. N. Rao, Pooja Gupta, T Geetha Kumary, S. K. Rai, Awadhesh Mani

Abstract: Manganite thin films are promising candidates for studying the strongly correlated electron systems. Understanding the growth- and morphology-driven changes in the physical properties of manganite thin films is vital for their applications in oxitronics. This work reports the morphological, structural, and electrical transport properties of nanostructured Nd0.6Sr0.4MnO3 (NSMO) thin films fabricated using the pulsed laser deposition technique. Scanning electron microscopy (SEM) imaging of the thin films revealed two prominent surface morphologies: a granular and a unique crossed-nano-rod-type morphology. From X-ray diffraction (XRD) and atomic force microscopy (AFM) analysis, we found that the observed nanostructures resulted from altered growth modes occurring on the terraced substrate surface. Furthermore, investigations on the electrical-transport properties of thin films revealed that the films with crossed-nano-rod type morphology showed a sharp resistive transition near the metal-to-insulator transition (MIT). An enhanced temperature coefficient of resistance (TCR) of up to one order of magnitude was also observed compared to the films with granular morphology. Such enhancement in TCR % by tuning the morphology makes these thin films promising candidates for developing oxide-based temperature sensors and detectors.

Submitted 13 January, 2023; originally announced January 2023.

Comments: Main article : page 1-23 , Supplementary information: page 23-27

arXiv:2301.01943 [pdf, other]

doi 10.3847/1538-4357/acb0c8

UOCS-IX. AstroSat/UVIT study of the open cluster NGC 2818: Blue Stragglers, Yellow Stragglers, Planetary Nebula, and their membership

Authors: Sharmila Rani, Gajendra Pandey, Annapurni Subramaniam, N. Kameswara Rao

Abstract: We present the first far-UV (FUV) imaging results of the intermediate-age Galactic open cluster NGC 2818 that has a planetary nebula (PN) within the field using images taken from the Ultra-violet Imaging Telescope (UVIT) aboard AstroSat. We identify cluster members by combining UVIT-detected sources with Gaia EDR3 data. We detect four bright and hot blue straggler stars (BSSs) and two yellow straggler stars (YSSs) based on their location in the optical and FUV-optical color-magnitude diagrams. Based on the parameters estimated using Spectral Energy Distribution (SED), we infer that BSSs are either collisional products or might have undetectable white dwarf (WD) companions. Our photometric analysis of YSSs confirms their binarity, consistent with the spectroscopic results. We find YSSs to be formed through a mass-transfer scenario and the hot components are likely to be A-type subdwarfs. A comparison of the radial velocity (RV), Gaia EDR3 proper motion of the PN with the cluster, and reddening towards the PN and the cluster does not rule out the membership of the PN. Comparing the central star's position with theoretical pAGB models suggests that it has already entered the WD cooling phase, and its mass is deduced to be ~0.66Msun. The corresponding progenitor mass turns out to be ~2.1Msun, comparable to the turn-off mass of the cluster, implying that the progenitor could have formed in the cluster. We suggest that NGC 2818 might be one of the few known clusters to host a PN, providing a unique opportunity to test stellar evolution models.

Submitted 5 January, 2023; originally announced January 2023.

Comments: 20 pages, 12 figures, 4 tables, Accepted for publication in ApJ

arXiv:2211.14261 [pdf, ps, other]

Temporal Waypoint Navigation of Multi-UAV Payload System using Barrier Functions

Authors: Nishanth Rao, Suresh Sundaram, Pushpak Jagtap

Abstract: Aerial package transportation often requires complex spatial and temporal specifications to be satisfied in order to ensure safe and timely delivery from one point to another. It is usually efficient to transport versatile payloads using multiple UAVs that can work collaboratively to achieve the desired task. The complex temporal specifications can be handled coherently by applying Signal Temporal Logic (STL) to dynamical systems. This paper addresses the problem of waypoint navigation of a multi-UAV payload system under temporal specifications using higher-order time-varying control barrier functions (HOCBFs). The complex nonlinear system of relative degree two is transformed into a simple linear system using input-output feedback linearization. An optimization-based control law is then derived to achieve the temporal waypoint navigation of the payload. The controller's efficacy and real-time implementability are demonstrated by simulating a package delivery scenario inside a high-fidelity Gazebo simulation environment.

Submitted 25 November, 2022; originally announced November 2022.

Comments: Submitted to ECC 2023

arXiv:2211.13328 [pdf, other]

Search Behavior Prediction: A Hypergraph Perspective

Authors: Yan Han, Edward W Huang, Wenqing Zheng, Nikhil Rao, Zhangyang Wang, Karthik Subbian

Abstract: Although bipartite shopping graphs are a straightforward way to model search behavior, they suffer from two challenges: 1) The majority of items are sporadically searched and hence have noisy/sparse query associations, leading to a long-tail distribution. 2) Infrequent queries are more likely to link to popular items, leading to another hurdle known as disassortative mixing. To address these two challenges, we go beyond the bipartite graph to take a hypergraph perspective, introducing a new paradigm that leverages auxiliary information from anonymized customer engagement sessions to assist the main task of query-item link prediction. This auxiliary information is available at web scale in the form of search logs. We treat all items appearing in the same customer session as a single hyperedge. The hypothesis is that items in a customer session are unified by a common shopping interest. With these hyperedges, we augment the original bipartite graph into a new hypergraph. We develop a Dual-Channel Attention-Based Hypergraph Neural Network (DCAH), which synergizes information from two potentially noisy sources (original query-item edges and item-item hyperedges). In this way, items on the tail are better connected due to the extra hyperedges, thereby enhancing their link prediction performance. We further integrate DCAH with self-supervised graph pre-training and/or DropEdge training, both of which effectively alleviate disassortative mixing. Extensive experiments on three proprietary E-Commerce datasets show that DCAH yields significant improvements of up to 24.6% in mean reciprocal rank (MRR) and 48.3% in recall compared to GNN-based baselines. Our source code is available at https://github.com/amazon-science/dual-channel-hypergraph-neural-network.

Submitted 28 November, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

Comments: WSDM 2023

arXiv:2211.06548 [pdf, ps, other]

Computationally Light Spectrally Normalized Memory Neuron Network based Estimator for GPS-Denied operation of Micro UAV

Authors: Nishanth Rao, Suresh Sundaram, Varun Raghavendra

Abstract: This paper addresses the problem of position estimation in UAVs operating in a cluttered environment where GPS information is unavailable. A model learning-based approach is proposed that takes in the rotor RPMs and past state as input and predicts the one-step-ahead position of the UAV using a novel spectral-normalized memory neural network (SN-MNN). The spectral normalization guarantees stable and reliable prediction performance. The predicted position is transformed to the global coordinate frame and then fused with the odometry of other peripheral sensors such as the IMU, barometer, and compass, using the onboard extended Kalman filter to estimate the states of the UAV. The experimental flight data collected from a motion capture facility using a micro-UAV is used to train the SN-MNN. The PX4-ECL library is used to replay the flight data using the proposed algorithm, and the estimated position is compared with actual ground truth data. The proposed algorithm doesn't require any additional onboard sensors, and is computationally light. The performance of the proposed approach is compared with the current state-of-the-art GPS-denied algorithms, and the proposed algorithm has the lowest RMSE for position estimates.

Submitted 3 December, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

Comments: Submitted to L4DC 2023
