-
Active Predictive Coding: A Unified Neural Framework for Learning Hierarchical World Models for Perception and Planning
Authors:
Rajesh P. N. Rao,
Dimitrios C. Gklezakos,
Vishwas Sathish
Abstract:
Predictive coding has emerged as a prominent model of how the brain learns through predictions, anticipating the importance accorded to predictive learning in recent AI architectures such as transformers. Here we propose a new framework for predictive coding called active predictive coding which can learn hierarchical world models and solve two radically different open problems in AI: (1) how do we learn compositional representations, e.g., part-whole hierarchies, for equivariant vision? and (2) how do we solve large-scale planning problems, which are hard for traditional reinforcement learning, by composing complex action sequences from primitive policies? Our approach exploits hypernetworks, self-supervised learning and reinforcement learning to learn hierarchical world models that combine task-invariant state transition networks and task-dependent policy networks at multiple abstraction levels. We demonstrate the viability of our approach on a variety of vision datasets (MNIST, FashionMNIST, Omniglot) as well as on a scalable hierarchical planning problem. Our results represent, to our knowledge, the first demonstration of a unified solution to the part-whole learning problem posed by Hinton, the nested reference frames problem posed by Hawkins, and the integrated state-action hierarchy learning problem in reinforcement learning.
Submitted 23 October, 2022;
originally announced October 2022.
-
TransLIST: A Transformer-Based Linguistically Informed Sanskrit Tokenizer
Authors:
Jivnesh Sandhan,
Rathin Singha,
Narein Rao,
Suvendu Samanta,
Laxmidhar Behera,
Pawan Goyal
Abstract:
Sanskrit Word Segmentation (SWS) is essential in making digitized texts available and in deploying downstream tasks. It is, however, non-trivial because of the sandhi phenomenon that modifies the characters at the word boundaries, and needs special treatment. Existing lexicon-driven approaches for SWS make use of the Sanskrit Heritage Reader, a lexicon-driven shallow parser, to generate the complete candidate solution space, over which various methods are applied to produce the most valid solution. However, these approaches fail when encountering out-of-vocabulary tokens. On the other hand, purely engineering methods for SWS have made use of recent advances in deep learning, but cannot make use of latent word information even when it is available.
To mitigate the shortcomings of both families of approaches, we propose the Transformer-based Linguistically Informed Sanskrit Tokenizer (TransLIST), consisting of (1) a module that encodes the character input along with latent-word information, which takes into account the sandhi phenomenon specific to SWS and is apt to work with partial or no candidate solutions, (2) a novel soft-masked attention to prioritize potential candidate words, and (3) a novel path ranking algorithm to rectify the corrupted predictions. Experiments on the benchmark datasets for SWS show that TransLIST outperforms the current state-of-the-art system by an average absolute gain of 7.2 points in terms of the perfect match (PM) metric. The codebase and datasets are publicly available at https://github.com/rsingha108/TransLIST
Submitted 21 October, 2022;
originally announced October 2022.
-
Neural Co-Processors for Restoring Brain Function: Results from a Cortical Model of Grasping
Authors:
Matthew J. Bryan,
Linxing Preston Jiang,
Rajesh P N Rao
Abstract:
Objective: A major challenge in designing closed-loop brain-computer interfaces is finding optimal stimulation patterns as a function of ongoing neural activity for different subjects and objectives. Approach: To achieve goal-directed closed-loop neurostimulation, we propose "neural co-processors" which use artificial neural networks and deep learning to learn optimal closed-loop stimulation policies, shaping neural activity and bridging injured neural circuits for targeted repair and rehabilitation. The co-processor adapts the stimulation policy as the biological circuit itself adapts to the stimulation, achieving a form of brain-device co-adaptation. Here we use simulations to lay the groundwork for future in vivo tests of neural co-processors. We leverage a cortical model of grasping, to which we applied various forms of simulated lesions, allowing us to develop the critical learning algorithms and study adaptations to non-stationarity. Main results: Our simulations show the ability of a neural co-processor to learn a stimulation policy using a supervised learning approach, and to adapt that policy as the underlying brain and sensors change. Our co-processor successfully co-adapted with the simulated brain to accomplish the reach-and-grasp task after a variety of lesions were applied, achieving recovery towards healthy function. Significance: Our results provide the first proof-of-concept demonstration of a co-processor for adaptive activity-dependent closed-loop neurostimulation, optimizing for a rehabilitation goal. While a gap remains between simulations and applications, our results provide insights on how co-processors may be developed for learning complex adaptive stimulation policies for a variety of neural rehabilitation and neuroprosthetic applications.
Submitted 20 March, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Enabling Autonomous Electron Microscopy for Networked Computation and Steering
Authors:
Anees Al-Najjar,
Nageswara S. V. Rao,
Ramanan Sankaran,
Maxim Ziatdinov,
Debangshu Mukherjee,
Olga Ovchinnikova,
Kevin Roccapriore,
Andrew R. Lupini,
Sergei V. Kalinin
Abstract:
Advanced electron microscopy workflows require an ecosystem of microscope instruments and computing systems possibly located at different sites to conduct remotely steered and automated experiments. Current workflow executions involve manual operations for steering and measurement tasks, which are typically performed from control workstations co-located with microscopes; consequently, their operational tempo and effectiveness are limited. We propose an approach based on separate data and control channels for such an ecosystem of Scanning Transmission Electron Microscopes (STEM) and computing systems, for which no general solutions presently exist, unlike the neutron and light source instruments. We demonstrate automated measurement transfers and remote steering of Nion STEM physical instruments over site networks. We propose a Virtual Infrastructure Twin (VIT) of this ecosystem, which is used to develop and test our steering software modules without requiring access to the physical instrument infrastructure. Additionally, we develop a VIT for a multiple laboratory scenario, which illustrates the applicability of this approach to ecosystems connected over wide-area networks, for the development and testing of software modules and their later field deployment.
Submitted 18 October, 2022;
originally announced October 2022.
-
Globular Cluster UVIT legacy Survey (GlobUleS) III. Omega Centauri in Far-Ultraviolet
Authors:
Deepthi S. Prabhu,
Annapurni Subramaniam,
Snehalata Sahu,
Chul Chung,
Nathan W. C. Leigh,
Emanuele Dalessandro,
Sourav Chatterjee,
N. Kameswara Rao,
Michael Shara,
Patrick Cote,
Samyaday Choudhury,
Gajendra Pandey,
Aldo A. R. Valcarce,
Gaurav Singh,
Joesph E. Postma,
Sharmila Rani,
Avrajit Bandyopadhyay,
Aaron M. Geller,
John Hutchings,
Thomas Puzia,
Mirko Simunovic,
Young-Jong Sohn,
Sivarani Thirupathi,
Ramakant Singh Yadav
Abstract:
We present the first comprehensive study of the most massive globular cluster Omega Centauri in the far-ultraviolet (FUV) extending from the center to ~ 28% of the tidal radius using the Ultraviolet Imaging Telescope aboard AstroSat. A comparison of the FUV-optical color-magnitude diagrams with available canonical models reveals that the horizontal branch (HB) stars bluer than the knee (hHBs) and the white dwarfs (WDs) are fainter in the FUV by ~ 0.5 mag than model predictions. They are also fainter than their counterparts in M13, another massive cluster. We simulated HB with at least five subpopulations, including three He-rich populations with a substantial He enrichment of Y up to 0.43 dex, to reproduce the observed FUV distribution. We find the He-rich younger subpopulations to be radially more segregated than the He-normal older ones, suggesting an in-situ enrichment from older generations. The Omega Cen hHBs span the same effective temperature range as their M13 counterparts, but some have smaller radii and lower luminosities. This may suggest that a fraction of Omega Cen hHBs are less massive than those of M13, similar to the result derived from earlier spectroscopic studies of outer extreme HB stars. The WDs in Omega Cen and M13 have similar luminosity-radius-effective temperature parameters, and 0.44 - 0.46 M$_\odot$ He-core WD model tracks evolving from progenitors with Y = 0.4 dex are found to fit the majority of these. This study provides constraints on the formation models of Omega Cen based on the estimated range in age, [Fe/H] and Y (in particular), for the HB stars.
Submitted 11 October, 2022;
originally announced October 2022.
-
A roadmap for edge computing enabled automated multidimensional transmission electron microscopy
Authors:
Debangshu Mukherjee,
Kevin M. Roccapriore,
Anees Al-Najjar,
Ayana Ghosh,
Jacob D. Hinkle,
Andrew R. Lupini,
Rama K. Vasudevan,
Sergei V. Kalinin,
Olga S. Ovchinnikova,
Maxim A. Ziatdinov,
Nageswara S. Rao
Abstract:
The advent of modern, high-speed electron detectors has made the collection of multidimensional hyperspectral transmission electron microscopy datasets, such as 4D-STEM, routine. However, many microscopists find such experiments daunting since the analysis, collection, long-term storage, and networking of such datasets remain challenging. Some common issues are the large and unwieldy size of these datasets, often running into several gigabytes, non-standardized data analysis routines, and a lack of clarity about the computing and network resources needed to utilize the electron microscope fully. The existing computing and networking bottlenecks introduce significant penalties in each step of these experiments, and thus, real-time analysis-driven automated experimentation for multidimensional TEM is exceptionally challenging. One solution is integrating microscopy with edge computing, where moderately powerful computational hardware performs the preliminary analysis before handing off the heavier computation to HPC systems. In this perspective, we trace the roots of computation in modern electron microscopy, demonstrate deep learning experiments running on an edge system, and discuss the networking requirements for tying together microscopes, edge computers, and HPC systems.
Submitted 5 October, 2022;
originally announced October 2022.
-
Effects of Solar Activity, Solar Insolation and the Lower Atmospheric Dust on the Martian Thermosphere
Authors:
N. V. Rao,
V. Leelavathi,
Ch. Yaswanth,
Anil Bhardwaj,
S. V. B. Rao
Abstract:
A diagnosis of the Ar densities measured by the Neutral Gas and Ion Mass Spectrometer aboard the Mars Atmosphere and Volatile EvolutioN (MAVEN) and the temperatures derived from these densities shows that solar activity, solar insolation, and the lower atmospheric dust are the dominant forcings of the Martian thermosphere. A methodology, based on multiple linear regression analysis, is developed to quantify the contributions of the dominant forcings to the densities and temperatures. The results of the present study show that a 100 sfu (solar flux units) change in the solar activity results in a corresponding change of approx. 136 K in the thermospheric temperatures. The solar insolation constrains the seasonal, latitudinal, and diurnal variations to be interdependent. Diurnal variation dominates the solar insolation variability, followed by the latitudinal and seasonal variations. Both the global and regional dust storms lead to considerable enhancements in the densities and temperatures of the Martian thermosphere. Using past data of the solar fluxes and the dust optical depths, the state of the Martian thermosphere is extrapolated back to Martian year (MY) 24. While the global dust storms of MY 25, MY 28 and MY 34 raise the thermospheric temperatures by approx. 22-38 K, the regional dust storm of MY 34 leads to approx. 15 K warming. Dust-driven thermospheric temperatures alone can enhance the hydrogen escape fluxes by 1.67-2.14 times compared to those without the dust. Dust effects are relatively more significant for global dust storms that occur in solar minimum than for those that occur in solar maximum.
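The multiple linear regression idea mentioned in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic (not MAVEN measurements), the variable names and the coefficients used to generate the data are hypothetical, and the slope of 1.36 K per sfu is chosen only to mirror the "approx. 136 K per 100 sfu" figure quoted above. It is not the authors' actual methodology.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical, synthetic predictors (NOT MAVEN data).
f107 = rng.uniform(30.0, 120.0, n)      # solar activity proxy, in sfu
insolation = rng.uniform(0.0, 1.0, n)   # normalized solar insolation
dust = rng.uniform(0.1, 1.5, n)         # dust optical depth

# Assumed coefficients used only to generate the synthetic temperatures:
# intercept (K), K per sfu, K per unit insolation, K per unit dust opacity.
true_coef = np.array([150.0, 1.36, 40.0, 20.0])

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), f107, insolation, dust])
temp = X @ true_coef + rng.normal(0.0, 5.0, n)  # add 5 K measurement noise

# Ordinary least squares fit of temperature on the three forcings.
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)

# Contribution of a 100 sfu change in solar activity to temperature (K).
delta_t_per_100_sfu = 100.0 * coef[1]
print(f"Fitted change per 100 sfu: {delta_t_per_100_sfu:.1f} K")
```

Each fitted coefficient estimates the temperature change per unit change in one forcing with the others held fixed, which is what allows the separate contributions to be quantified.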
Submitted 3 October, 2022;
originally announced October 2022.
-
Search for relativistic fractionally charged particles in space
Authors:
DAMPE Collaboration,
F. Alemanno,
C. Altomare,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
M. S. Cai,
E. Casilli,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. De-Benedittis,
I. De Mitri,
F. de Palma,
M. Deliyergiyev,
A. Di Giovanni,
M. Di Santo
, et al. (126 additional authors not shown)
Abstract:
More than a century after the oil drop experiment, the possible existence of fractionally charged particles (FCPs) remains unsettled. The search for FCPs is crucial for some extensions of the Standard Model in particle physics. Most of the previously conducted searches for FCPs in cosmic rays were based on experiments underground or at high altitudes. However, there have been few searches for FCPs in cosmic rays carried out in orbit other than AMS-01 flown by a space shuttle and BESS by a balloon at the top of the atmosphere. In this study, we conduct an FCP search in space based on on-orbit data obtained using the DArk Matter Particle Explorer (DAMPE) satellite over a period of five years. Unlike underground experiments, which require an FCP energy of the order of hundreds of GeV, our FCP search starts at only a few GeV. An upper limit of $6.2\times 10^{-10}~\mathrm{cm^{-2}\,sr^{-1}\,s^{-1}}$ is obtained for the flux. Our results demonstrate that DAMPE exhibits higher sensitivity than experiments of similar types by three orders of magnitude, which more stringently restricts the conditions for the existence of FCPs in primary cosmic rays.
Submitted 9 September, 2022;
originally announced September 2022.
-
Coexistent quantum channel characterization using spectrally resolved Bayesian quantum process tomography
Authors:
Joseph C. Chapman,
Joseph M. Lukens,
Muneer Alshowkan,
Nageswara Rao,
Brian T. Kirby,
Nicholas A. Peters
Abstract:
The coexistence of quantum and classical signals over the same optical fiber with minimal degradation of the transmitted quantum information is critical for operating large-scale quantum networks over the existing communications infrastructure. Here, we systematically characterize the quantum channel that results from simultaneously distributing approximate single-photon polarization-encoded qubits and classical light of varying intensities through fiber-optic channels of up to 15 km. Using spectrally resolved quantum process tomography with a Bayesian reconstruction method we developed, we estimate the full quantum channel from experimental photon counting data, both with and without classical background. Furthermore, although we find the exact channel description to be a weak function of the pump polarization, we nevertheless show that the coexistent fiber-based quantum channel has high process fidelity with an ideal depolarizing channel when the noise is dominated by Raman scattering. These results provide a basis for the future development of quantum repeater designs and quantum error correcting codes for real-world channels and inform models used in the analysis and simulation of quantum networks.
Submitted 16 March, 2023; v1 submitted 30 August, 2022;
originally announced August 2022.
-
ECP SOLLVE: Validation and Verification Testsuite Status Update and Compiler Insight for OpenMP
Authors:
Thomas Huber,
Swaroop Pophale,
Nolan Baker,
Michael Carr,
Nikhil Rao,
Jaydon Reap,
Kristina Holsapple,
Joshua Hoke Davis,
Tobias Burnus,
Seyong Lee,
David E. Bernholdt,
Sunita Chandrasekaran
Abstract:
The OpenMP language continues to evolve with every new specification release, as does the need to validate and verify the new features that have been introduced. With the release of OpenMP 5.0 and OpenMP 5.1, plenty of new target offload and host-based features have been introduced to the programming model. While OpenMP continues to grow in maturity, there is an observable growth in the number of compiler and hardware vendors that support OpenMP. In this manuscript, we focus on evaluating the conformity and implementation progress of various compiler vendors such as Cray, IBM, GNU, Clang/LLVM, NVIDIA, Intel and AMD. We specifically address the 4.5, 5.0, and 5.1 versions of the specification.
Submitted 14 November, 2022; v1 submitted 28 August, 2022;
originally announced August 2022.
-
Blending type Approximations by Kantorovich variant of $α$-Schurer operators
Authors:
Nadeem Rao,
Mamta Rani,
Adem Kiliçman,
Pradeep Malik,
Mohammad Ayman-Mursaleen
Abstract:
In the present manuscript, we present a new sequence of operators, i.e., $α$-Bernstein-Schurer-Kantorovich operators depending on two parameters $α\in[0,1]$ and $ρ>0$ for one and two variables to approximate measurable functions on $[0, 1+q]$, $q>0$. Next, we give basic results and discuss the rapidity of convergence and order of approximation for the univariate and bivariate versions of these sequences in their respective sections. Further, graphical and numerical analyses are presented. Moreover, local and global approximation properties are discussed in terms of first and second order modulus of smoothness, Peetre's K-functional and weight functions for these sequences in different spaces of functions.
Submitted 21 August, 2022;
originally announced August 2022.
-
Disentangling the dominant drivers of gravity wave variability in the Martian thermosphere
Authors:
N. V. Rao,
V. Leelavathi,
Ch. Yaswanth,
S. V. B. Rao
Abstract:
In this study, we extracted the amplitudes of the gravity waves (GWs) from the neutral densities measured in situ by the neutral gas and ion mass spectrometer aboard the Mars atmosphere and volatile evolution mission. The spatial and temporal variabilities of the GWs show that solar activity (the F10.7 cm solar flux corrected for a heliocentric distance of 1.66 AU), solar insolation, and the lower atmospheric dust are the dominant drivers of the GW variability in the thermosphere. We developed a methodology in which a linear regression analysis has been used to disentangle the complex variabilities of the GWs. The three dominant drivers could account for most of the variability in the GW amplitudes. Variability caused by the sources of GWs and the effects of winds and the global circulation in the mesosphere and lower thermosphere are the other factors that could not be addressed. The results of the present study show that for every 100 sfu increase in the solar activity, the GW amplitudes in the thermosphere decrease by ~9%. Solar insolation drives the diurnal, seasonal and latitudinal variations of ~9%, ~4% and ~6%, respectively. Using the historical data of the dust opacity and solar activity, we estimated the GW amplitudes of the Martian thermosphere from MY 24 to MY 35. The GW amplitudes were significantly reduced during the maximum of solar cycle 23 and were highest in the solar minimum. The global dust storms of MY 25, 28, and 34 lead to significant enhancements in the GW amplitudes.
Submitted 15 August, 2022;
originally announced August 2022.
-
Fractional Brownian Motion: Local Modulus of Continuity with Refined Almost Sure Upper Bound and First Exit Time from One-sided Barrier
Authors:
Qidi Peng,
Nan Rao
Abstract:
Based on an optimal rate wavelet series representation, we derive a local modulus of continuity result with a refined almost sure upper bound for fractional Brownian motion. The obtained upper bound of the small fractional Brownian increments is of order $\mathcal O_{a.s.}\big(|h|^H\sqrt{\log\log |h|^{-1}}\big)$ as $|h|\to0$, and an upper bound of its $p$th moment is provided, for any $p>0$. This result fills the gap of the law of iterated logarithm for fractional Brownian motion, where the moments' information of the random multiplier in the upper bound is missing. With this enhanced upper bound and some new results on the distribution of the maximum of fractional Brownian motion, we obtain a new and refined asymptotic estimate of the upper-tail probability for a fractional Brownian motion to first exit from a positive-valued barrier over time $T$, as $T\to+\infty$.
Submitted 18 October, 2023; v1 submitted 20 July, 2022;
originally announced July 2022.
-
An Input-Output Feedback Linearization based Exponentially Stable Controller for Multi-UAV Payload Transport
Authors:
Nishanth Rao,
Suresh Sundaram
Abstract:
In this paper, an exponentially stable trajectory tracking controller is proposed for multi-UAV payload transport. The multi-UAV payload system has a 2-DOF magnetic spherical joint between the UAVs and the vertical rigid links of the payload frame, so the UAVs can roll or pitch freely. These vertical links are rigidly attached to the payload and cannot move. An input-output feedback linearized model is derived for the complete payload-UAV system along with thrust vectoring control for trajectory tracking of the payload. The theoretical analysis of the tracking control laws shows that the control law is exponentially stable, thus guaranteeing safe transportation along the desired trajectory. To validate the performance of the proposed control law, the results of a numerical simulation as well as a high-fidelity Gazebo real-time simulation are presented. Next, the robustness of the proposed controller is analyzed against two practical situations: external disturbance on the payload and payload mass uncertainty. The results clearly indicate that the proposed controller is robust and computationally efficient while achieving exponentially stable trajectory tracking.
Submitted 10 July, 2022;
originally announced July 2022.
-
Hyper-Universal Policy Approximation: Learning to Generate Actions from a Single Image using Hypernets
Authors:
Dimitrios C. Gklezakos,
Rishi Jha,
Rajesh P. N. Rao
Abstract:
Inspired by Gibson's notion of object affordances in human vision, we ask the question: how can an agent learn to predict an entire action policy for a novel object or environment given only a single glimpse? To tackle this problem, we introduce the concept of Universal Policy Functions (UPFs) which are state-to-action mappings that generalize not only to new goals but most importantly to novel, unseen environments. Specifically, we consider the problem of efficiently learning such policies for agents with limited computational and communication capacity, constraints that are frequently encountered in edge devices. We propose the Hyper-Universal Policy Approximator (HUPA), a hypernetwork-based model to generate small task- and environment-conditional policy networks from a single image, with good generalization properties. Our results show that HUPAs significantly outperform an embedding-based alternative for generated policies that are size-constrained. Although this work is restricted to a simple map-based navigation task, future work includes applying the principles behind HUPAs to learning more general affordances for objects and environments.
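The hypernetwork idea described in this abstract, one network emitting the weights of a small policy network, can be sketched as follows. This is a toy illustration under stated assumptions, not the HUPA architecture: the image encoder is replaced by a random embedding vector, the hypernetwork is a single untrained linear map, and all names and sizes (`EMBED_DIM`, `generate_policy`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
EMBED_DIM = 16   # size of the embedding standing in for an encoded image
STATE_DIM = 4    # size of the agent's state vector
HIDDEN = 8       # hidden units in the generated policy network
N_ACTIONS = 3    # number of discrete actions

# Parameter shapes of the small policy network: W1, b1, W2, b2.
shapes = [(STATE_DIM, HIDDEN), (HIDDEN,), (HIDDEN, N_ACTIONS), (N_ACTIONS,)]
n_params = sum(int(np.prod(s)) for s in shapes)

# Toy hypernetwork: one linear map from the embedding to all policy weights.
# It is randomly initialized here; in practice it would be trained.
hyper_weights = rng.normal(0.0, 0.1, (EMBED_DIM, n_params))

def generate_policy(embedding):
    """Map an embedding to the parameter list of a small policy network."""
    flat = embedding @ hyper_weights
    params, start = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        params.append(flat[start:start + size].reshape(shape))
        start += size
    return params

def policy(params, state):
    """Run the generated policy: one tanh hidden layer, softmax over actions."""
    w1, b1, w2, b2 = params
    hidden = np.tanh(state @ w1 + b1)
    logits = hidden @ w2 + b2
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

embedding = rng.normal(size=EMBED_DIM)   # stands in for an encoded single image
params = generate_policy(embedding)
probs = policy(params, rng.normal(size=STATE_DIM))
print(probs)
```

The point of the design is that only the small generated network needs to run on the constrained device, while a different embedding yields a different policy without retraining that network.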
Submitted 7 July, 2022;
originally announced July 2022.
-
Text Enriched Sparse Hyperbolic Graph Convolutional Networks
Authors:
Nurendra Choudhary,
Nikhil Rao,
Karthik Subbian,
Chandan K. Reddy
Abstract:
Heterogeneous networks, which connect informative nodes containing text with different edge types, are routinely used to store and process information in various real-world applications. Graph Neural Networks (GNNs) and their hyperbolic variants provide a promising approach to encode such networks in a low-dimensional latent space through neighborhood aggregation and hierarchical feature extraction, respectively. However, these approaches typically ignore metapath structures and the available semantic information. Furthermore, these approaches are sensitive to the noise present in the training data. To tackle these limitations, in this paper, we propose Text Enriched Sparse Hyperbolic Graph Convolution Network (TESH-GCN) to capture the graph's metapath structures using semantic signals and further improve prediction in large heterogeneous graphs. In TESH-GCN, we extract semantic node information, which successively acts as a connection signal to extract relevant nodes' local neighborhood and graph-level metapath features from the sparse adjacency tensor in a reformulated hyperbolic graph convolution layer. These extracted features in conjunction with semantic features from the language model (for robustness) are used for the final downstream task. Experiments on various heterogeneous graph datasets show that our model outperforms the current state-of-the-art approaches by a large margin on the task of link prediction. We also report a reduction in both the training time and model parameters compared to the existing hyperbolic approaches through a reformulated hyperbolic graph convolution. Furthermore, we illustrate the robustness of our model by experimenting with different levels of simulated noise in both the graph structure and text, and also, present a mechanism to explain TESH-GCN's prediction by analyzing the extracted metapaths.
Submitted 7 July, 2022; v1 submitted 5 July, 2022;
originally announced July 2022.
-
Tangle of Spin Double Helices in the Honeycomb Kitaev-$Γ$ Model
Authors:
Jheng-Wei Li,
Nihal Rao,
Jan von Delft,
Lode Pollet,
Ke Liu
Abstract:
We investigate the ground-state nature of the honeycomb Kitaev-$Γ$ model in the material-relevant parameter regime through a combination of classical and quantum simulations. The classical model is imprinted with a tangle of highly structured spin double helices. This helix tangle exhibits $18$ inequivalent helical axes and features a spontaneous periodicity anisotropy and a ${\rm sgn}(Γ)$-determined chirality pattern. Infinite PEPS simulations with clusters up to $36$ sites identify hallmarks of this many-body order in the quantum spin-$1/2$ model. Our findings provide a fresh perspective of the Kitaev-$Γ$ model and enrich the physics of Kitaev magnetism.
Submitted 24 May, 2023; v1 submitted 17 June, 2022;
originally announced June 2022.
-
Recursive Neural Programs: Variational Learning of Image Grammars and Part-Whole Hierarchies
Authors:
Ares Fisher,
Rajesh P. N. Rao
Abstract:
Human vision involves parsing and representing objects and scenes using structured representations based on part-whole hierarchies. Computer vision and machine learning researchers have recently sought to emulate this capability using capsule networks, reference frames and active predictive coding, but a generative model formulation has been lacking. We introduce Recursive Neural Programs (RNPs), which, to our knowledge, is the first neural generative model to address the part-whole hierarchy learning problem. RNPs model images as hierarchical trees of probabilistic sensory-motor programs that recursively reuse learned sensory-motor primitives to model an image within different reference frames, forming recursive image grammars. We express RNPs as structured variational autoencoders (sVAEs) for inference and sampling, and demonstrate parts-based parsing, sampling and one-shot transfer learning for MNIST, Omniglot and Fashion-MNIST datasets, demonstrating the model's expressive power. Our results show that RNPs provide an intuitive and explainable way of composing objects and scenes, allowing rich compositionality and intuitive interpretations of objects in terms of part-whole hierarchies.
Submitted 25 June, 2022; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search
Authors:
Chandan K. Reddy,
Lluís Màrquez,
Fran Valero,
Nikhil Rao,
Hugo Zaragoza,
Sambaran Bandyopadhyay,
Arnab Biswas,
Anlu Xing,
Karthik Subbian
Abstract:
Improving the quality of search results can significantly enhance users' experience and engagement with search engines. In spite of several recent advancements in the fields of machine learning and data mining, correctly classifying items for a particular user search query has been a long-standing challenge, which still has considerable room for improvement. This paper introduces the "Shopping Queries Dataset", a large dataset of difficult Amazon search queries and results, publicly released with the aim of fostering research in improving the quality of search results. The dataset contains around 130 thousand unique queries and 2.6 million manually labeled (query, product) relevance judgements. The dataset is multilingual with queries in English, Japanese, and Spanish. The Shopping Queries Dataset is being used in one of the KDDCup'22 challenges. In this paper, we describe the dataset and present three evaluation tasks along with baseline results: (i) ranking the results list, (ii) classifying product results into relevance categories, and (iii) identifying substitute products for a given query. We anticipate that this data will become the gold standard for future research in the topic of product search.
Submitted 14 June, 2022;
originally announced June 2022.
-
Analysis of Learner Independent Variables for Estimating Assessment Items Difficulty Level
Authors:
Shilpi Banerjee,
N. J. Rao
Abstract:
The quality of assessment determines the quality of learning, and is characterized by validity, reliability and difficulty. Mastery of learning is generally represented by the difficulty levels of assessment items. A very large number of variables are identified in the literature to measure the difficulty level. These variables, which are not completely independent of one another, are categorized into learner dependent, learner independent, generic, non-generic and score based. This research proposes a model for predicting the difficulty level of assessment items in engineering courses using learner independent and generic variables. An ordinal regression model is developed for predicting the difficulty level, and uses six variables including three stimuli variables (item presentation, usage of technical notations and number of resources), two content related variables (number of concepts and procedures) and one task variable (number of conditions). Experimental results from three engineering courses provide around 80% accuracy in classification of items using the proposed model.
Submitted 9 June, 2022;
originally announced June 2022.
-
Learning Backward Compatible Embeddings
Authors:
Weihua Hu,
Rajas Bansal,
Kaidi Cao,
Nikhil Rao,
Karthik Subbian,
Jure Leskovec
Abstract:
Embeddings, low-dimensional vector representation of objects, are fundamental in building modern machine learning systems. In industrial settings, there is usually an embedding team that trains an embedding model to solve intended tasks (e.g., product recommendation). The produced embeddings are then widely consumed by consumer teams to solve their unintended tasks (e.g., fraud detection). However, as the embedding model gets updated and retrained to improve performance on the intended task, the newly-generated embeddings are no longer compatible with the existing consumer models. This means that historical versions of the embeddings can never be retired or all consumer teams have to retrain their models to make them compatible with the latest version of the embeddings, both of which are extremely costly in practice. Here we study the problem of embedding version updates and their backward compatibility. We formalize the problem where the goal is for the embedding team to keep updating the embedding version, while the consumer teams do not have to retrain their models. We develop a solution based on learning backward compatible embeddings, which allows the embedding model version to be updated frequently, while also allowing the latest version of the embedding to be quickly transformed into any backward compatible historical version of it, so that consumer teams do not have to retrain their models. Under our framework, we explore six methods and systematically evaluate them on a real-world recommender system application. We show that the best method, which we call BC-Aligner, maintains backward compatibility with existing unintended tasks even after multiple model version updates. Simultaneously, BC-Aligner achieves the intended task performance similar to the embedding model that is solely optimized for the intended task.
Submitted 7 June, 2022;
originally announced June 2022.
-
An efficient Deep Spatio-Temporal Context Aware decision Network (DST-CAN) for Predictive Manoeuvre Planning
Authors:
Jayabrata Chowdhury,
Suresh Sundaram,
Nishanth Rao,
Narasimhan Sundararajan
Abstract:
To ensure the safety and efficiency of its maneuvers, an Autonomous Vehicle (AV) should anticipate the future intentions of surrounding vehicles using its sensor information. If an AV can predict its surrounding vehicles' future trajectories, it can make safe and efficient manoeuvre decisions. In this paper, we present such a Deep Spatio-Temporal Context-Aware decision Network (DST-CAN) model for predictive manoeuvre planning of AVs. A memory neuron network is used to predict future trajectories of its surrounding vehicles. The driving environment's spatio-temporal information (past, present, and predicted future trajectories) is embedded into a context-aware grid. The proposed DST-CAN model employs these context-aware grids as inputs to a convolutional neural network to understand the spatial relationships between the vehicles and determine a safe and efficient manoeuvre decision. The DST-CAN model also uses information of human driving behavior on a highway. Performance evaluation of DST-CAN has been carried out using the two publicly available NGSIM datasets, US-101 and I-80. Also, rule-based ground truth decisions have been compared with those generated by DST-CAN. The results clearly show that DST-CAN can make much better decisions with 3 seconds of predicted trajectories of neighboring vehicles compared to currently existing methods that do not use this prediction.
Submitted 8 July, 2024; v1 submitted 20 May, 2022;
originally announced May 2022.
-
Globular Clusters UVIT Legacy Survey (GlobULeS) I. FUV-optical Color-Magnitude Diagrams for Eight Globular Clusters
Authors:
Snehalata Sahu,
Annapurni Subramaniam,
Gaurav Singh,
Ramakant Yadav,
Aldo R. Valcarce,
Samyaday Choudhury,
Sharmila Rani,
Deepthi S. Prabhu,
Chul Chung,
Patrick Côté,
Nathan Leigh,
Aaron M. Geller,
Sourav Chatterjee,
N. Kameswara Rao,
Avrajit Bandyopadhyay,
Michael Shara,
Emanuele Dalessandro,
Gajendra Pandey,
Joesph E. Postma,
John Hutchings,
Mirko Simunovic,
Peter B. Stetson,
Sivarani Thirupathi,
Thomas Puzia,
Young-Jong Sohn
Abstract:
We present the first results of eight Globular Clusters (GCs) from the AstroSat/UVIT Legacy Survey program GlobULeS based on the observations carried out in two FUV filters (F148W and F169M). The FUV-optical and FUV-FUV color-magnitude diagrams (CMDs) of GCs with the proper motion membership were constructed by combining the UVIT data with HST UV Globular Cluster Survey (HUGS) data for inner regions and Gaia Early Data Release (EDR3) for regions outside the HST's field. We detect sources as faint as F148W $\sim$ 23.5~mag which are classified based on their locations in CMDs by overlaying stellar evolutionary models. The CMDs of 8 GCs are combined with the previous UVIT studies of 3 GCs to create stacked FUV-optical CMDs to highlight the features/peculiarities found in the different evolutionary sequences. The FUV (F148W) detected stellar populations of 11 GCs comprise 2,816 Horizontal Branch (HB) stars (190 Extreme HB candidates), 46 post-HB (pHB), 221 Blue Straggler Stars (BSS), and 107 White Dwarf (WD) candidates. We note that the blue HB color extension obtained from F148W$-$G color and the number of FUV detected EHB candidates are strongly correlated with the maximum internal Helium (He) variation within each GC, suggesting that the FUV-optical plane is the most sensitive to He abundance variations in the HB. We discuss the potential science cases that will be addressed using these catalogues including HB morphologies, BSSs, pHB, and WD stars.
Submitted 27 April, 2022;
originally announced April 2022.
-
Optimal resource allocation for flexible-grid entanglement distribution networks
Authors:
J. Alnas,
M. Alshowkan,
N. S. V. Rao,
N. A. Peters,
J. M. Lukens
Abstract:
We use a genetic algorithm (GA) as a design aid for determining the optimal provisioning of entangled photon spectrum in flex-grid quantum networks with arbitrary numbers of channels and users. After introducing a general model for entanglement distribution based on frequency-polarization hyperentangled biphotons, we derive upper bounds on fidelity and entangled bit rate for networks comprising one-to-one user connections. Simple conditions based on user detector quality and link efficiencies are found that determine whether entanglement is possible. We successfully apply a GA to find optimal resource allocations in four different representative network scenarios and validate features of our model experimentally in a quantum local area network in deployed fiber. Our results show promise for the rapid design of large-scale entanglement distribution networks.
Submitted 13 April, 2022;
originally announced April 2022.
-
Comments on Comments: Where Code Review and Documentation Meet
Authors:
Nikitha Rao,
Jason Tsay,
Martin Hirzel,
Vincent J. Hellendoorn
Abstract:
A central function of code review is to increase understanding; helping reviewers understand a code change aids in knowledge transfer and finding bugs. Comments in code largely serve a similar purpose, helping future readers understand the program. It is thus natural to study what happens when these two forms of understanding collide. We ask: what documentation-related comments do reviewers make and how do they affect understanding of the contribution? We analyze ca. 700K review comments on 2,000 (Java and Python) GitHub projects, and propose several filters to identify which comments are likely to be in response to a change in documentation and/or call for such a change. We identify 65K such cases. We next develop a taxonomy of the reviewer intents behind such "comments on comments". We find that achieving a shared understanding of the code is key: reviewer comments most often focused on clarification, followed by pointing out issues to fix, such as typos and outdated comments. Curiously, clarifying comments were frequently suggested (often verbatim) by the reviewer, indicating a desire to persist their understanding acquired during code review. We conclude with a discussion of implications of our comments-on-comments dataset for research on improving code review, including the potential benefits for automating code review.
Submitted 31 March, 2022;
originally announced April 2022.
-
Quantum Networks for High Energy Physics
Authors:
Andrei Derevianko,
Eden Figueroa,
Julián MartÍnez-Rincón,
Inder Monga,
Andrei Nomerotski,
Cristián H. Peña,
Nicholas A. Peters,
Raphael Pooser,
Nageswara Rao,
Anze Slosar,
Panagiotis Spentzouris,
Maria Spiropulu,
Paul Stankus,
Wenji Wu,
Si Xie
Abstract:
Quantum networks of quantum objects promise to be exponentially more powerful than the objects considered independently. To live up to this promise will require the development of error mitigation and correction strategies to preserve quantum information as it is initialized, stored, transported, utilized, and measured. The quantum information could be encoded in discrete variables such as qubits, in continuous variables, or anything in-between. Quantum computational networks promise to enable simulation of physical phenomena of interest to the HEP community. Quantum sensor networks promise new measurement capability to test for new physics and improve upon existing measurements of fundamental constants. Such networks could exist at multiple scales from the nano-scale to a global-scale quantum network.
Submitted 31 March, 2022;
originally announced March 2022.
-
Quantum counterfactuality with identical particles
Authors:
Vinod N. Rao,
Anindita Banerjee,
R. Srikanth
Abstract:
Quantum self-interference enables the counterfactual transmission of information, whereby the transmitted bits involve no particles traveling through the channel. In this work, we show how counterfactuality can be realized even when the self interference is replaced by interference between identical particles. Interestingly, the facet of indistinguishability called forth here is associated with first-order coherence, and is different from the usual notion of indistinguishability associated with the (anti-)commutation relations of mode operators. From an experimental perspective, the simplest implementation of the proposed idea can be realized by slight modifications to existing protocols for differential-phase-shift quantum key distribution or interaction-free measurement.
Submitted 17 October, 2023; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Task-Agnostic Graph Explanations
Authors:
Yaochen Xie,
Sumeet Katariya,
Xianfeng Tang,
Edward Huang,
Nikhil Rao,
Karthik Subbian,
Shuiwang Ji
Abstract:
Graph Neural Networks (GNNs) have emerged as powerful tools to encode graph-structured data. Due to their broad applications, there is an increasing need to develop tools to explain how GNNs make decisions given graph-structured data. Existing learning-based GNN explanation approaches are task-specific in training and hence suffer from crucial drawbacks. Specifically, they are incapable of producing explanations for a multitask prediction model with a single explainer. They are also unable to provide explanations in cases where the GNN is trained in a self-supervised manner, and the resulting representations are used in future downstream tasks. To address these limitations, we propose a Task-Agnostic GNN Explainer (TAGE) that is independent of downstream models and trained under self-supervision with no knowledge of downstream tasks. TAGE enables the explanation of GNN embedding models with unseen downstream tasks and allows efficient explanation of multitask models. Our extensive experiments show that TAGE can significantly speed up the explanation efficiency by using the same model to explain predictions for multiple downstream tasks while achieving explanation quality as good as or even better than current state-of-the-art GNN explanation approaches. Our code is publicly available as part of the DIG library at https://github.com/divelab/DIG/tree/main/dig/xgraph/TAGE/.
Submitted 23 September, 2022; v1 submitted 16 February, 2022;
originally announced February 2022.
-
Integrated Decision Control Approach for Cooperative Safety-Critical Payload Transport in a Cluttered Environment
Authors:
Nishanth Rao,
Suresh Sundaram
Abstract:
In this paper, the problem of coordinated transportation of a heavy payload by a team of UAVs in a cluttered environment is addressed. The payload is modeled as a rigid body and is assumed to track a pre-computed global flight trajectory from a start point to a goal point. Due to the presence of local dynamic obstacles in the environment, the UAVs must ensure that there is no collision between the payload and these obstacles while ensuring that the payload oscillations are kept to a minimum. An Integrated Decision Controller (IDC) is proposed that integrates the optimal tracking control law given by a centralized Model Predictive Controller with safety-critical constraints provided by the Exponential Control Barrier Functions. The entire payload-UAV system is enclosed by a safe convex hull boundary, and the IDC ensures that no obstacle enters this boundary. To evaluate the performance of the IDC, the results for a numerical simulation as well as a high-fidelity Gazebo simulation are presented. An ablation study is conducted to analyze the robustness of the proposed IDC against practical dubieties like noisy state values, relative obstacle safety margin, and payload mass uncertainty. The results clearly show that the IDC achieves both trajectory tracking and obstacle avoidance successfully while restricting the payload oscillations within a safe limit.
Submitted 31 January, 2022;
originally announced January 2022.
-
Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies
Authors:
Dimitrios C. Gklezakos,
Rajesh P. N. Rao
Abstract:
We introduce Active Predictive Coding Networks (APCNs), a new class of neural networks that solve a major problem posed by Hinton and others in the fields of artificial intelligence and brain modeling: how can neural networks learn intrinsic reference frames for objects and parse visual scenes into part-whole hierarchies by dynamically allocating nodes in a parse tree? APCNs address this problem by using a novel combination of ideas: (1) hypernetworks are used for dynamically generating recurrent neural networks that predict parts and their locations within intrinsic reference frames conditioned on higher object-level embedding vectors, and (2) reinforcement learning is used in conjunction with backpropagation for end-to-end learning of model parameters. The APCN architecture lends itself naturally to multi-level hierarchical learning and is closely related to predictive coding models of cortical function. Using the MNIST, Fashion-MNIST and Omniglot datasets, we demonstrate that APCNs can (a) learn to parse images into part-whole hierarchies, (b) learn compositional representations, and (c) transfer their knowledge to unseen classes of objects. With their ability to dynamically generate parse trees with part locations for objects, APCNs offer a new framework for explainable AI that leverages advances in deep learning while retaining interpretability and compositionality.
Submitted 14 January, 2022;
originally announced January 2022.
-
Predictive Coding Theories of Cortical Function
Authors:
Linxing Preston Jiang,
Rajesh P. N. Rao
Abstract:
Predictive coding is a unifying framework for understanding perception, action and neocortical organization. In predictive coding, different areas of the neocortex implement a hierarchical generative model of the world that is learned from sensory inputs. Cortical circuits are hypothesized to perform Bayesian inference based on this generative model. Specifically, the Rao-Ballard hierarchical predictive coding model assumes that the top-down feedback connections from higher to lower order cortical areas convey predictions of lower-level activities. The bottom-up, feedforward connections in turn convey the errors between top-down predictions and actual activities. These errors are used to correct current estimates of the state of the world and generate new predictions. Through the objective of minimizing prediction errors, predictive coding provides a functional explanation for a wide range of neural responses and many aspects of brain organization.
Submitted 18 May, 2023; v1 submitted 18 December, 2021;
originally announced December 2021.
-
On The Effect Of Coding Artifacts On Acoustic Scene Classification
Authors:
Nagashree K. S. Rao,
Nils Peters
Abstract:
Previous DCASE challenges contributed to an increase in the performance of acoustic scene classification systems. State-of-the-art classifiers demand significant processing capabilities and memory, which is challenging for resource-constrained mobile or IoT edge devices. Thus, these models are more likely to be deployed on more powerful hardware, where they classify audio recordings previously uploaded (or streamed) from low-power edge devices. In such a scenario, the edge device may apply perceptual audio coding to reduce the transmission data rate. This paper explores the effect of perceptual audio coding on the classification performance using a DCASE 2020 challenge contribution [1]. We found that classification accuracy can degrade by up to 57% compared to classifying original (uncompressed) audio. We further demonstrate how lossy audio compression techniques during model training can improve classification accuracy of compressed audio signals even for audio codecs and codec bitrates not included in the training process.
Submitted 9 December, 2021;
originally announced December 2021.
-
Direct observation of Jahn-Teller critical dynamics at a charge-order Verwey transition
Authors:
Vinícius Pascotto Gastaldo,
Mala N. Rao,
Alexey Bosak,
Matteo d'Astuto,
Andrea Prodi,
Marine Verseils,
Yannick Klein,
Christophe Bellin,
Luigi Paolasini,
Adilson J. A. de Oliveira,
Edmondo Gilioli,
Samrath Lal Chaplot,
Andrea Gauzzi
Abstract:
By means of diffuse and inelastic x-ray scattering (DS, IXS), we probe directly the charge-ordering (CO) dynamics in the Verwey system (NaMn$_3$)Mn$_4$O$_{12}$, where a peculiar quadruple perovskite structure with no oxygen disorder stabilizes a nearly full Mn$^{3+}$/Mn$^{4+}$ static charge order at $T_{\rm CO}$=175 K concomitant with a commensurate structural modulation with propagation vector ${\bf q}_{\rm CO}=(\frac{1}{2},\frac{1}{2},0)$. At $T_{\rm CO}$, the IXS spectra unveil a softening of a 35.3 meV phonon at ${\bf q}_{\rm CO}$. Lattice dynamical calculations enable us to attribute this soft phonon to an A$_g$ mode whose polarization matches the Jahn-Teller-like distortion pattern of the structural modulation. This result demonstrates that the Jahn-Teller instability is the driving force of the CO Verwey transition in (NaMn$_3$)Mn$_4$O$_{12}$, thus elucidating a long-standing controversy regarding the mechanism of this transition observed in other mixed-valence systems like magnetite.
Submitted 8 December, 2021;
originally announced December 2021.
-
THz Band Channel Measurements and Statistical Modeling for Urban Microcellular Environments
Authors:
Naveed A. Abbasi,
Jorge Gomez-Ponce,
Revanth Kondaveti,
Ashish Kumar,
Eshan Bhagat,
Rakesh N S Rao,
Shadi Abu-Surra,
Gary Xu,
Charlie Zhang,
Andreas F. Molisch
Abstract:
The THz band (0.1-10 THz) has attracted considerable attention for next-generation wireless communications, due to the large amount of available bandwidth that may be key to meeting the rapidly increasing data rate requirements. Before deploying a system in this band, a detailed wireless channel analysis is required as the basis for proper design and testing of system implementations. One of the most important deployment scenarios of this band is the outdoor microcellular environment, where the Transmitter (Tx) and the Receiver (Rx) have a significant height difference (typically $\ge 10$ m). In this paper, we present double-directional (i.e., directionally resolved at both link ends) channel measurements in such a microcellular scenario encompassing street canyons and an open square. Measurements are made over a 1 GHz bandwidth (145-146 GHz) with an antenna beamwidth of 13 degrees; distances between Tx and Rx are up to 85 m and the Tx is at a height of 11.5 m above the ground. The measurements are analyzed to estimate path loss, shadowing, delay spread, angular spread, and multipath component (MPC) power distribution. These results allow the development of more realistic and detailed THz channel models and system performance assessment.
Submitted 3 December, 2021;
originally announced December 2021.
-
Photonic and electronic state interactions in BaTiO3 based Optical Microcavity
Authors:
Jitendra Nath Acharyya,
R. B. Gangineni,
D. Narayana Rao,
G. Vijaya Prakash
Abstract:
Photonic-mode-mediated absorption dynamics at femtosecond time scales, along with the control of spontaneous emission tunability, are investigated in an all-dielectric optical microcavity having BaTiO3 (BTO) as the defect layer. The cavity-enhanced transient absorption reveals the dominant excited state absorption (ESA) of both photonic and electronic modes due to the strong influence of third-order optical nonlinearity. Photoluminescence of BTO is found to be guided and tuned by the photonic cavity mode. Such active photonic structures can be envisaged as potential candidates for nonlinear optics and photonic device applications.
Submitted 2 December, 2021;
originally announced December 2021.
-
Advanced Architectures for High-Performance Quantum Networking
Authors:
Muneer Alshowkan,
Philip G. Evans,
Brian P. Williams,
Nageswara S. V. Rao,
Claire E. Marvinney,
Yun-Yi Pai,
Benjamin J. Lawrie,
Nicholas A. Peters,
Joseph M. Lukens
Abstract:
As practical quantum networks prepare to serve an ever-expanding number of nodes, a need has grown for advanced auxiliary classical systems that support the quantum protocols and maintain compatibility with the existing fiber-optic infrastructure. We propose and demonstrate a quantum local area network design that addresses current deployment limitations in timing and security in a scalable fashion using commercial off-the-shelf components. First, we employ White Rabbit switches to synchronize three remote nodes with ultra-low timing jitter, significantly increasing the fidelities of the distributed entangled states over previous work with Global Positioning System clocks. Second, using a parallel quantum key distribution channel, we secure the classical communications needed for instrument control and data management. In this way, the conventional network which manages our entanglement network is secured using keys generated via an underlying quantum key distribution layer, preserving the integrity of the supporting systems and the relevant data in a future-proof fashion.
Submitted 30 November, 2021;
originally announced November 2021.
-
Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods
Authors:
Wenqing Zheng,
Edward W Huang,
Nikhil Rao,
Sumeet Katariya,
Zhangyang Wang,
Karthik Subbian
Abstract:
Graph Neural Networks (GNNs) have achieved state-of-the-art performance in node classification, regression, and recommendation tasks. GNNs work well when rich and high-quality connections are available. However, their effectiveness is often jeopardized in many real-world graphs in which node degrees have power-law distributions. The extreme case of this situation, where a node may have no neighbors, is called Strict Cold Start (SCS). SCS forces the prediction to rely completely on the node's own features. We propose Cold Brew, a teacher-student distillation approach to address the SCS and noisy-neighbor challenges for GNNs. We also introduce feature contribution ratio (FCR), a metric to quantify the behavior of inductive GNNs to solve SCS. We experimentally show that FCR disentangles the contributions of different graph data components and helps select the best architecture for SCS generalization. We further demonstrate the superior performance of Cold Brew on several public benchmark and proprietary e-commerce datasets, where many nodes have either very few or noisy connections. Our source code is available at https://github.com/amazon-research/gnn-tail-generalization.
Submitted 13 March, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Lessons Learned on the Interface between Quantum and Conventional Networking
Authors:
Muneer Alshowkan,
Nageswara S. V. Rao,
Joseph C. Chapman,
Brian P. Williams,
Philip G. Evans,
Raphael C. Pooser,
Joseph M. Lukens,
Nicholas A. Peters
Abstract:
The future Quantum Internet is expected to be based on a hybrid architecture with core quantum transport capabilities complemented by conventional networking. Practical and foundational considerations indicate the need for conventional control and data planes that (i) utilize extensive existing telecommunications fiber infrastructure, and (ii) provide parallel conventional data channels needed for quantum networking protocols. We propose a quantum-conventional network (QCN) harness to implement a new architecture to meet these requirements. The QCN control plane carries the control and management traffic, whereas its data plane handles the conventional and quantum data communications. We established a local area QCN connecting three quantum laboratories over dedicated fiber and conventional network connections. We describe considerations and tradeoffs for layering QCN functionalities, informed by our recent quantum entanglement distribution experiments conducted over this network.
Submitted 3 November, 2021;
originally announced November 2021.
-
Cluster-and-Conquer: A Framework For Time-Series Forecasting
Authors:
Reese Pathak,
Rajat Sen,
Nikhil Rao,
N. Benjamin Erichson,
Michael I. Jordan,
Inderjit S. Dhillon
Abstract:
We propose a three-stage framework for forecasting high-dimensional time-series data. Our method first estimates parameters for each univariate time series. Next, we use these parameters to cluster the time series. These clusters can be viewed as multivariate time series, for which we then compute parameters. The forecasted values of a single time series can depend on the history of other time series in the same cluster, accounting for intra-cluster similarity while minimizing potential noise in predictions by ignoring inter-cluster effects. Our framework -- which we refer to as "cluster-and-conquer" -- is highly general, allowing for any time-series forecasting and clustering method to be used in each step. It is computationally efficient and embarrassingly parallel. We motivate our framework with a theoretical analysis in an idealized mixed linear regression setting, where we provide guarantees on the quality of the estimates. We accompany these guarantees with experimental results that demonstrate the advantages of our framework: when instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets, sometimes outperforming deep-learning-based approaches.
Submitted 26 October, 2021;
originally announced October 2021.
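The three stages described in the Cluster-and-Conquer abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-series model is taken to be AR(p) fitted by least squares, the clustering step is a plain k-means on the fitted coefficients, and the per-cluster model is a VAR(1). The function names (`fit_ar`, `kmeans`, `fit_var1`, `cluster_and_conquer`) and all parameter choices are hypothetical.

```python
import numpy as np

def fit_ar(series, p):
    """Stage 1: least-squares AR(p) coefficients for one univariate series."""
    n = len(series)
    # Column k holds the lag-(k+1) values aligned with the targets series[p:].
    X = np.column_stack([series[p - k - 1 : n - k - 1] for k in range(p)])
    coef, *_ = np.linalg.lstsq(X, series[p:], rcond=None)
    return coef

def kmeans(points, k, n_iter=50):
    """Stage 2: plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def fit_var1(Y):
    """Stage 3: least-squares VAR(1) for one cluster, Y[t+1] ~ Y[t] @ B."""
    B, *_ = np.linalg.lstsq(Y[:-1], Y[1:], rcond=None)
    return B

def cluster_and_conquer(data, n_clusters, p=2):
    """data has shape (T, N), one column per series.

    Returns the cluster label of each series and a one-step-ahead forecast.
    """
    params = np.array([fit_ar(data[:, i], p) for i in range(data.shape[1])])
    labels = kmeans(params, n_clusters)
    forecast = np.empty(data.shape[1])
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue
        # Each series is forecast from the history of its own cluster only.
        B = fit_var1(data[:, idx])
        forecast[idx] = data[-1, idx] @ B
    return labels, forecast
```

Because each cluster is fitted independently in the final loop, that loop could be run in parallel, which is the sense in which the abstract calls the framework embarrassingly parallel. Any other forecasting or clustering method could be substituted for the three helper functions.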
-
Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs
Authors:
Nurendra Choudhary,
Nikhil Rao,
Sumeet Katariya,
Karthik Subbian,
Chandan K. Reddy
Abstract:
Logical reasoning over Knowledge Graphs (KGs) is a fundamental technique that can provide an efficient querying mechanism over large and incomplete databases. Current approaches employ spatial geometries such as boxes to learn query representations that encompass the answer entities and model the logical operations of projection and intersection. However, their geometry is restrictive and leads to non-smooth strict boundaries, which in turn results in ambiguous answer entities. Furthermore, previous works propose transformation tricks to handle unions, which results in non-closure; such operations therefore cannot be chained in a stream. In this paper, we propose a Probabilistic Entity Representation Model (PERM) to encode entities as a Multivariate Gaussian density with mean and covariance parameters to capture its semantic position and smooth decision boundary, respectively. We also define the closed logical operations of projection, intersection, and union that can be aggregated using an end-to-end objective function. On the logical query reasoning problem, we demonstrate that the proposed PERM significantly outperforms the state-of-the-art methods on various public benchmark KG datasets on standard evaluation metrics. We also evaluate PERM's competence on a COVID-19 drug-repurposing case study and show that our proposed work is able to recommend drugs with substantially better F1 than current methods. Finally, we demonstrate the working of PERM's query answering process through a low-dimensional visualization of the Gaussian representations.
Submitted 30 October, 2021; v1 submitted 26 October, 2021;
originally announced October 2021.
-
TorchEsegeta: Framework for Interpretability and Explainability of Image-based Deep Learning Models
Authors:
Soumick Chatterjee,
Arnab Das,
Chirag Mandal,
Budhaditya Mukhopadhyay,
Manish Vipinraj,
Aniruddh Shukla,
Rajatha Nagaraja Rao,
Chompunuch Sarasaen,
Oliver Speck,
Andreas Nürnberger
Abstract:
Clinicians are often very sceptical about applying automatic image processing approaches, especially deep learning based methods, in practice. One main reason for this is the black-box nature of these approaches and the inherent lack of insight into the automatically derived decisions. In order to increase trust in these methods, this paper presents approaches that help to interpret and explain the results of deep learning algorithms by depicting the anatomical areas which influence the decision of the algorithm most. Moreover, this research presents a unified framework, TorchEsegeta, for applying various interpretability and explainability techniques to deep learning models and generating visual interpretations and explanations for clinicians to corroborate their clinical findings. In addition, this will aid in gaining confidence in such methods. The framework builds on existing interpretability and explainability techniques that currently focus on classification models, extending them to segmentation tasks. In addition, these methods have been adapted to 3D models for volumetric analysis. The proposed framework provides methods to quantitatively compare visual explanations using infidelity and sensitivity metrics. This framework can be used by data scientists to perform post-hoc interpretations and explanations of their models, develop more explainable tools and present the findings to clinicians to increase their faith in such models. The proposed framework was evaluated based on a use case scenario of vessel segmentation models trained on Time-of-flight (TOF) Magnetic Resonance Angiogram (MRA) images of the human brain. Quantitative and qualitative results of a comparative study of different models and interpretability methods are presented. Furthermore, this paper provides an extensive overview of several existing interpretability and explainability methods.
Submitted 7 February, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
AstroSat study of the globular cluster NGC 2298: probable evolutionary scenarios of hot HB stars
Authors:
Sharmila Rani,
Gajendra Pandey,
Annapurni Subramaniam,
Chul Chung,
Snehalata Sahu,
N. Kameswara Rao
Abstract:
We present the far-UV (FUV) photometry of images acquired with UVIT on AstroSat to probe the horizontal branch (HB) population of the Galactic globular cluster NGC 2298. UV-optical color-magnitude diagrams (CMDs) are constructed for member stars in combination with HST UV Globular Cluster Survey (HUGS) data for the central region and Gaia and ground-based photometric data for the outer region. A blue HB (BHB) sequence with a spread and four hot HB stars are detected in all FUV-optical CMDs and are compared with updated theoretical BaSTI isochrones and synthetic HB models with a range in helium abundance, suggesting that the hot HB stars are helium enhanced when compared to the BHB. The estimated effective temperature, radius, and luminosity of HB stars, using best SED fits, were compared with various HB models. BHB stars span a temperature range of 7,500-12,250 K. Three of the hot HB stars have temperatures of 35,000-40,000 K, whereas one has a temperature of around 100,000 K. We suggest the following evolutionary scenarios: two stars are likely to be the progeny of extreme HB (EHB) stars formed through an early hot-flasher scenario; one is likely to be an EHB star with probable helium enrichment; and the hottest HB star, which is about to enter the WD cooling phase, could have evolved from the BHB phase. Nevertheless, these are interesting spectroscopic targets to understand the late stages of evolution.
Submitted 9 October, 2021;
originally announced October 2021.
-
Emergent behavior and neural dynamics in artificial agents tracking turbulent plumes
Authors:
Satpreet Harcharan Singh,
Floris van Breugel,
Rajesh P. N. Rao,
Bingni Wen Brunton
Abstract:
Tracking a turbulent plume to locate its source is a complex control problem because it requires multi-sensory integration and must be robust to intermittent odors, changing wind direction, and variable plume statistics. This task is routinely performed by flying insects, often over long distances, in pursuit of food or mates. Several aspects of this remarkable behavior have been examined in detail in many experimental studies. Here, we take a complementary in silico approach, using artificial agents trained with reinforcement learning to develop an integrated understanding of the behaviors and neural computations that support plume tracking. Specifically, we use deep reinforcement learning (DRL) to train recurrent neural network (RNN) agents to locate the source of simulated turbulent plumes. Interestingly, the agents' emergent behaviors resemble those of flying insects, and the RNNs learn to represent task-relevant variables, such as head direction and time since last odor encounter. Our analyses suggest an intriguing experimentally testable hypothesis for tracking plumes in changing wind direction -- that agents follow local plume shape rather than the current wind direction. While reflexive short-memory behaviors are sufficient for tracking plumes in constant wind, longer timescales of memory are essential for tracking plumes that switch direction. At the level of neural dynamics, the RNNs' population activity is low-dimensional and organized into distinct dynamical structures, with some correspondence to behavioral modules. Our in silico approach provides key intuitions for turbulent plume tracking strategies and motivates future targeted experimental and theoretical developments.
Submitted 17 December, 2021; v1 submitted 25 September, 2021;
originally announced September 2021.
-
Scalable Feature Selection for (Multitask) Gradient Boosted Trees
Authors:
Cuize Han,
Nikhil Rao,
Daria Sorokina,
Karthik Subbian
Abstract:
Gradient Boosted Decision Trees (GBDTs) are widely used for building ranking and relevance models in search and recommendation. Considerations such as latency and interpretability dictate the use of as few features as possible to train these models. Feature selection in GBDT models typically involves heuristically ranking the features by importance and selecting the top few, or performing a full backward feature elimination routine. On-the-fly feature selection methods proposed previously scale suboptimally with the number of features, which can be daunting in high dimensional settings. We develop a scalable forward feature selection variant for GBDT, via a novel group testing procedure that works well in high dimensions, and enjoys favorable theoretical performance and computational guarantees. We show via extensive experiments on both public and proprietary datasets that the proposed method offers significant speedups in training time, while remaining competitive with existing GBDT methods in terms of model performance metrics. We also extend the method to the multitask setting, allowing the practitioner to select common features across tasks, as well as task-specific features.
Submitted 4 September, 2021;
originally announced September 2021.
-
Maximizing and Satisficing in Multi-armed Bandits with Graph Information
Authors:
Parth K. Thaker,
Mohit Malu,
Nikhil Rao,
Gautam Dasarathy
Abstract:
Pure exploration in multi-armed bandits has emerged as an important framework for modeling decision-making and search under uncertainty. In modern applications, however, one is often faced with a tremendously large number of options. Even obtaining one observation per option may be too costly, rendering traditional pure exploration algorithms ineffective. Fortunately, one often has access to similarity relationships among the options that can be leveraged. In this paper, we consider the pure exploration problem in stochastic multi-armed bandits where the similarities between the arms are captured by a graph and the rewards may be represented as a smooth signal on this graph. In particular, we consider the problem of finding the arm with the maximum reward (i.e., the maximizing problem) or one with a sufficiently high reward (i.e., the satisficing problem) under this model. We propose novel algorithms GRUB (GRaph-based UcB) and $\zeta$-GRUB for these problems and provide a theoretical characterization of their performance which specifically elicits the benefit of the graph side information. We also prove a lower bound on the data requirement, showing a large class of problems where these algorithms are near-optimal. We complement our theory with experimental results that show the benefit of capitalizing on such side information.
Submitted 20 November, 2022; v1 submitted 2 August, 2021;
originally announced August 2021.
-
Measurement of the cosmic ray helium energy spectrum from 70 GeV to 80 TeV with the DAMPE space mission
Authors:
F. Alemanno,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
M. S. Cai,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. D'Amone,
A. De Benedittis,
I. De Mitri,
F. de Palma,
M. Deliyergiyev,
M. Di Santo,
T. K. Dong,
Z. X. Dong,
G. Donvito
, et al. (120 additional authors not shown)
Abstract:
The measurement of the energy spectrum of cosmic ray helium nuclei from 70 GeV to 80 TeV using 4.5 years of data recorded by the DArk Matter Particle Explorer (DAMPE) is reported in this work. A hardening of the spectrum is observed at an energy of about 1.3 TeV, similar to previous observations. In addition, a spectral softening at about 34 TeV is revealed for the first time with large statistics and well controlled systematic uncertainties, with an overall significance of $4.3σ$. The DAMPE spectral measurements of both cosmic protons and helium nuclei suggest a particle charge dependent softening energy, although with current uncertainties a dependence on the number of nucleons cannot be ruled out.
Submitted 21 May, 2021; v1 submitted 19 May, 2021;
originally announced May 2021.
-
A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations
Authors:
Varun Nagaraj Rao,
Xingjian Zhen,
Karen Hovsepian,
Mingwei Shen
Abstract:
Explainable deep learning models are advantageous in many situations. Prior work mostly provides unimodal explanations through post-hoc approaches that are not part of the original system design. Explanation mechanisms also ignore useful textual information present in images. In this paper, we propose MTXNet, an end-to-end trainable multimodal architecture to generate multimodal explanations, which focuses on the text in the image. We curate a novel dataset, TextVQA-X, containing ground truth visual and multi-reference textual explanations that can be leveraged during both training and evaluation. We then quantitatively show that training with multimodal explanations complements model performance and surpasses unimodal baselines by up to 7% in CIDEr scores and 2% in IoU. More importantly, we demonstrate that the multimodal explanations are consistent with human interpretations, help justify the model's decisions, and provide useful insights to help diagnose an incorrect prediction. Finally, we describe a real-world e-commerce application for using the generated multimodal explanations.
Submitted 28 April, 2021;
originally announced May 2021.
-
Variable selection for longitudinal survey data
Authors:
Laura Dumitrescu,
Wei Qian,
J. N. K. Rao
Abstract:
In this article we propose a new variable selection method for analyzing data collected from longitudinal sample surveys. The procedure is based on the survey-weighted quadratic inference function, which was recently introduced as an alternative to the survey-weighted generalized estimating function. Under the joint model-design framework, we introduce the penalized survey-weighted quadratic inference estimator and obtain sufficient conditions for its existence, weak consistency, sparsity and asymptotic normality. To illustrate the finite sample performance of the model selection procedure, we include a limited simulation study.
Submitted 2 May, 2021;
originally announced May 2021.
-
Vec2GC -- A Graph Based Clustering Method for Text Representations
Authors:
Rajesh N Rao,
Manojit Chakraborty
Abstract:
NLP pipelines with limited or no labeled data rely on unsupervised methods for document processing. Unsupervised approaches typically depend on clustering of terms or documents. In this paper, we introduce a novel clustering algorithm, Vec2GC (Vector to Graph Communities), an end-to-end pipeline to cluster terms or documents for any given text corpus. Our method uses community detection on a weighted graph of the terms or documents, created using text representation learning. The Vec2GC clustering algorithm is a density-based approach that also supports hierarchical clustering.
Submitted 12 April, 2023; v1 submitted 15 April, 2021;
originally announced April 2021.
-
SEMIE: SEMantically Infused Embeddings with Enhanced Interpretability for Domain-specific Small Corpus
Authors:
Rishabh Gupta,
Rajesh N Rao
Abstract:
Word embeddings are a basic building block of modern NLP pipelines. Efforts have been made to learn rich, efficient, and interpretable embeddings for large generic datasets available in the public domain. However, these embeddings have limited applicability for small corpora from specific domains such as automotive, manufacturing, maintenance and support, etc. In this work, we present a comprehensive notion of interpretability for word embeddings and propose a novel method to generate highly interpretable and efficient embeddings for a domain-specific small corpus. We report the evaluation results of our resulting word embeddings and demonstrate their novel features for enhanced interpretability.
Submitted 21 March, 2021;
originally announced March 2021.