3D E-textile for Exercise Physiology and Clinical Maternal Health Monitoring

Authors: Junyi Zhao, Chansoo Kim, Weilun Li, Zichao Wen, Zhili Xiao, Yong Wang, Shantanu Chakrabartty, Chuan Wang

Abstract: Electronic textiles (E-textiles) offer great wearing comfort and unobtrusiveness, thus holding potential for next-generation health monitoring wearables. However, the practical implementation is hampered by challenges associated with poor signal quality, substantial motion artifacts, durability for long-term usage, and non-ideal user experience. Here, we report a cost-effective E-textile system that features 3D microfiber-based electrodes for greatly increasing the surface area. The soft and fluffy conductive microfibers disperse freely and securely adhere to the skin, achieving a low impedance at the electrode-skin interface even in the absence of gel. A superhydrophobic fluorinated self-assembled monolayer was deposited on the E-textile surface to render it waterproof while retaining the electrical conductivity. Equipped with a custom-designed motion-artifact canceling wireless data recording circuit, the E-textile system could be integrated into a variety of smart garments for exercise physiology and health monitoring applications. Real-time multimodal electrophysiological signal monitoring, including electrocardiogram (ECG) and electromyography (EMG), was successfully carried out during strenuous cycling and even underwater swimming activities. Furthermore, a multi-channel E-textile was developed and implemented in clinical patient studies for simultaneous real-time monitoring of maternal ECG and uterine EMG signals, incorporating spatial-temporal potential mapping capabilities.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 16 pages, 6 figures

arXiv:2406.05224 [pdf, other]

ON-OFF Neuromorphic ISING Machines using Fowler-Nordheim Annealers

Authors: Zihao Chen, Zhili Xiao, Mahmoud Akl, Johannes Leugering, Omowuyi Olajide, Adil Malik, Nik Dennler, Chad Harper, Subhankar Bose, Hector A. Gonzalez, Jason Eshraghian, Riccardo Pignari, Gianvito Urgese, Andreas G. Andreou, Sadasivan Shankar, Christian Mayr, Gert Cauwenberghs, Shantanu Chakrabartty

Abstract: We introduce NeuroSA, a neuromorphic architecture specifically designed to ensure asymptotic convergence to the ground state of an Ising problem using an annealing process that is governed by the physics of quantum mechanical tunneling using Fowler-Nordheim (FN). The core component of NeuroSA consists of a pair of asynchronous ON-OFF neurons, which effectively map classical simulated annealing (SA) dynamics onto a network of integrate-and-fire (IF) neurons. The threshold of each ON-OFF neuron pair is adaptively adjusted by an FN annealer which replicates the optimal escape mechanism and convergence of SA, particularly at low temperatures. To validate the effectiveness of our neuromorphic Ising machine, we systematically solved various benchmark MAX-CUT combinatorial optimization problems. Across multiple runs, NeuroSA consistently generates solutions that approach the state-of-the-art level with high accuracy (greater than 99%), and without any graph-specific hyperparameter tuning. For practical illustration, we present results from an implementation of NeuroSA on the SpiNNaker2 platform, highlighting the feasibility of mapping our proposed architecture onto a standard neuromorphic accelerator platform.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 36 pages, 8 figures
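
For readers unfamiliar with the baseline that the NeuroSA abstract refers to, the following is a minimal sketch of classical single-spin-flip simulated annealing applied to MAX-CUT. It is not the NeuroSA architecture and does not model ON-OFF neurons or Fowler-Nordheim annealers; the function names, the logarithmic cooling schedule, and the default parameters are illustrative assumptions.

```python
import math
import random


def cut_value(edges, spins):
    """Number of edges whose endpoints carry opposite spins."""
    return sum(1 for u, v in edges if spins[u] != spins[v])


def simulated_annealing_maxcut(n, edges, steps=5000, t0=2.0, seed=0):
    """Classical simulated annealing for MAX-CUT (illustrative sketch).

    n      -- number of vertices, labelled 0..n-1
    edges  -- list of (u, v) pairs without self-loops
    steps, t0, seed -- hypothetical tuning knobs, not taken from the paper
    """
    rng = random.Random(seed)
    # Adjacency lists make the effect of one flip a local computation.
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    spins = [rng.choice((-1, 1)) for _ in range(n)]
    current = cut_value(edges, spins)
    best, best_spins = current, spins[:]

    for step in range(steps):
        # Logarithmic cooling schedule (an assumed, textbook-style choice).
        temperature = t0 / math.log(step + 2)
        i = rng.randrange(n)
        same = sum(1 for j in neighbors[i] if spins[j] == spins[i])
        diff = len(neighbors[i]) - same
        gain = same - diff  # change in cut size if spin i is flipped
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if gain >= 0 or rng.random() < math.exp(gain / temperature):
            spins[i] = -spins[i]
            current += gain
            if current > best:
                best, best_spins = current, spins[:]
    return best, best_spins


# Usage: a 4-cycle, whose maximum cut contains all four edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
best, best_spins = simulated_annealing_maxcut(4, edges)
print(best, best_spins)
```

The returned assignment is the best state visited during the run, so the reported cut size always matches the returned spins.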

arXiv:2402.14878 [pdf, other]

Energy-efficiency Limits on Training AI Systems using Learning-in-Memory

Authors: Zihao Chen, Johannes Leugering, Gert Cauwenberghs, Shantanu Chakrabartty

Abstract: Learning-in-memory (LIM) is a recently proposed paradigm to overcome fundamental memory bottlenecks in training machine learning systems. While compute-in-memory (CIM) approaches can address the so-called memory-wall (i.e. energy dissipated due to repeated memory read access) they are agnostic to the energy dissipated due to repeated memory writes at the precision required for training (the update-wall), and they don't account for the energy dissipated when transferring information between short-term and long-term memories (the consolidation-wall). The LIM paradigm proposes that these bottlenecks, too, can be overcome if the energy barrier of physical memories is adaptively modulated such that the dynamics of memory updates and consolidation match the Lyapunov dynamics of gradient-descent training of an AI model. In this paper, we derive new theoretical lower bounds on energy dissipation when training AI systems using different LIM approaches. The analysis presented here is model-agnostic and highlights the trade-off between energy efficiency and the speed of training. The resulting non-equilibrium energy-efficiency bounds have a similar flavor as that of Landauer's energy-dissipation bounds. We also extend these limits by taking into account the number of floating-point operations (FLOPs) used for training, the size of the AI model, and the precision of the training parameters. Our projections suggest that the energy-dissipation lower-bound to train a brain-scale AI system (comprising $10^{15}$ parameters) using LIM is $10^8 \sim 10^9$ Joules, which is of the same order of magnitude as Landauer's adiabatic lower-bound and $6$ to $7$ orders of magnitude lower than the projections obtained using state-of-the-art AI accelerator hardware lower-bounds.

Submitted 21 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: 23 pages, 7 figures
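
To give a sense of scale for the Landauer bound mentioned in the abstract above, the sketch below evaluates the textbook per-bit erasure limit $k_B T \ln 2$ and converts an energy budget into an equivalent number of Landauer-limited bit erasures. This is not the paper's derivation: the 300 K operating temperature and both function names are assumptions chosen only for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def landauer_limit_joules(temperature_kelvin=300.0):
    """Minimum energy to erase one bit at the given temperature: k_B * T * ln 2."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def equivalent_bit_erasures(energy_joules, temperature_kelvin=300.0):
    """Number of Landauer-limited bit erasures that a given energy budget covers."""
    return energy_joules / landauer_limit_joules(temperature_kelvin)


# Usage: per-bit limit at an assumed 300 K, and the erasure count for the
# lower end of the 10^8 to 10^9 J range quoted in the abstract.
print(landauer_limit_joules())        # roughly 2.87e-21 J
print(equivalent_bit_erasures(1e8))   # roughly 3.5e28 erasures
```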

arXiv:2402.04959 [pdf, other]

Margin Propagation based XOR-SAT Solvers for Decoding of LDPC Codes

Authors: Ankita Nandi, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: Decoding of Low-Density Parity Check (LDPC) codes can be viewed as a special case of XOR-SAT problems, for which low-computational complexity bit-flipping algorithms have been proposed in the literature. However, a performance gap exists between the bit-flipping LDPC decoding algorithms and the benchmark LDPC decoding algorithms, such as the Sum-Product Algorithm (SPA). In this paper, we propose an XOR-SAT solver using log-sum-exponential functions and demonstrate its advantages for LDPC decoding. This is then approximated using the Margin Propagation formulation to attain a low-complexity LDPC decoder. The proposed algorithm uses soft information to decide the bit-flips that maximize the number of parity check constraints satisfied over an optimization function. The proposed solver can achieve results that are within $0.1$dB of the Sum-Product Algorithm for the same number of code iterations. It is also at least 10x lesser than other Gradient-Descent Bit Flipping decoding algorithms, which are also bit-flipping algorithms based on optimization functions. The approximation using the Margin Propagation formulation does not require any multipliers, resulting in significantly lower computational complexity than other soft-decision Bit-Flipping LDPC decoders.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 12 pages, 7 figures, Paper submitted to IEEE Transactions on Communications
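
Several abstracts in this listing rely on Margin Propagation (MP) as a piecewise-linear stand-in for the log-sum-exp function. The abstracts do not state the formula, so the sketch below assumes the reverse water-filling definition commonly used for MP: given inputs $x_i$ and a parameter $\gamma > 0$, find $z$ such that $\sum_i \max(0, x_i - z) = \gamma$. This is a floating-point reference sketch with hypothetical function names; it uses a division for clarity and is not the multiplierless hardware formulation or the decoder described in the paper.

```python
import math


def margin_propagation(x, gamma):
    """Return z such that sum(max(0, xi - z) for xi in x) equals gamma.

    Assumed definition of MP (reverse water-filling); requires gamma > 0
    and a non-empty input sequence.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if not x:
        raise ValueError("x must be non-empty")
    ordered = sorted(x, reverse=True)
    running_sum = 0.0
    for k, value in enumerate(ordered, start=1):
        running_sum += value
        # Candidate level if exactly the k largest inputs lie above z.
        z = (running_sum - gamma) / k
        # Valid once every input outside the top k lies at or below z.
        if k == len(ordered) or z >= ordered[k]:
            return z


def log_sum_exp(x):
    """Numerically stable log-sum-exp, shown only for comparison with MP."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))


# Usage: the constraint sum(max(0, xi - z)) == gamma holds for the result.
x = [1.0, 2.0, 3.0]
z = margin_propagation(x, gamma=1.0)
print(z, sum(max(0.0, xi - z) for xi in x))  # 2.0 1.0
print(log_sum_exp(x))
```

Only sorting, additions, comparisons, and one division per candidate level are needed here, which hints at why hardware variants of MP can avoid multipliers.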

arXiv:2304.13918 [pdf, other]

Neuromorphic Computing with AER using Time-to-Event-Margin Propagation

Authors: Madhuvanthi Srivatsav R, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: Address-Event-Representation (AER) is a spike-routing protocol that allows the scaling of neuromorphic and spiking neural network (SNN) architectures to a size that is comparable to that of digital neural network architectures. However, in conventional neuromorphic architectures, the AER protocol and, in general, any virtual interconnect plays only a passive role in computation, i.e., only for routing spikes and events. In this paper, we show how causal temporal primitives like delay, triggering, and sorting inherent in the AER protocol itself can be exploited for scalable neuromorphic computing using our proposed technique called Time-to-Event Margin Propagation (TEMP). The proposed TEMP-based AER architecture is fully asynchronous and relies on interconnect delays for memory and computing as opposed to conventional and local multiply-and-accumulate (MAC) operations. We show that the time-based encoding in the TEMP neural network produces a spatio-temporal representation that can encode a large number of discriminatory patterns. As a proof-of-concept, we show that a trained TEMP-based convolutional neural network (CNN) can demonstrate an accuracy greater than 99% on the MNIST dataset. Overall, our work is a biologically inspired computing paradigm that brings forth a new dimension of research to the field of neuromorphic computing.

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2304.11816 [pdf, other]

Multiplierless In-filter Computing for tinyML Platforms

Authors: Abhishek Ramdas Nair, Pallab Kumar Nath, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: Wildlife conservation using continuous monitoring of environmental factors and biomedical classification, which generate a vast amount of sensor data, is a challenge due to limited bandwidth in the case of remote monitoring. It becomes critical to have classification where data is generated, and only classified data is used for monitoring. We present a novel multiplierless framework for in-filter acoustic classification using Margin Propagation (MP) approximation used in low-power edge devices deployable in remote areas with limited connectivity. The entire design of this classification framework is based on a template-based kernel machine, which includes feature extraction and inference, and uses basic primitives like addition/subtraction, shift, and comparator operations, for hardware implementation. Unlike full precision training methods for traditional classification, we use MP-based approximation for training, including backpropagation mitigating approximation errors. The proposed framework is general enough for acoustic classification. However, we demonstrate the hardware friendliness of this framework by implementing a parallel Finite Impulse Response (FIR) filter bank in a kernel machine classifier optimized for a Field Programmable Gate Array (FPGA). The FIR filter acts as the feature extractor and non-linear kernel for the kernel machine implemented using MP approximation and a downsampling method to reduce the order of the filters. The FPGA implementation on Spartan 7 shows that the MP-approximated in-filter kernel machine is more efficient than traditional classification frameworks with just less than 1K slices.

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.09242 [pdf, other]

A Framework for Analyzing Cross-correlators using Price's Theorem and Piecewise-Linear Decomposition

Authors: Zhili Xiao, Shantanu Chakrabartty

Abstract: Precise estimation of cross-correlation or similarity between two random variables lies at the heart of signal detection, hyperdimensional computing, associative memories, and neural networks. Although a vast literature exists on different methods for estimating cross-correlations, the answer to the question "what is the best and simplest method to estimate cross-correlations using finite samples?" is still unclear. In this paper, we first argue that the standard empirical approach might not be the optimal method even though the estimator exhibits uniform convergence to the true cross-correlation. Instead, we show that there exists a large class of simple non-linear functions that can be used to construct cross-correlators with a higher signal-to-noise ratio (SNR). To demonstrate this, we first present a general mathematical framework using Price's Theorem that allows us to analyze cross-correlators constructed using a mixture of piece-wise linear functions. Using this framework and high-dimensional embedding, we show that some of the most promising cross-correlators are based on Huber's loss functions, margin-propagation (MP) functions, and the log-sum-exp (LSE) functions.

Submitted 31 October, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

arXiv:2206.14581 [pdf, other]

On-device Synaptic Memory Consolidation using Fowler-Nordheim Quantum-tunneling

Authors: Mustafizur Rahman, Subhankar Bose, Shantanu Chakrabartty

Abstract: Synaptic memory consolidation has been heralded as one of the key mechanisms for supporting continual learning in neuromorphic Artificial Intelligence (AI) systems. Here we report that a Fowler-Nordheim (FN) quantum-tunneling device can implement synaptic memory consolidation similar to what can be achieved by algorithmic consolidation models like the cascade and the elastic weight consolidation (EWC) models. The proposed FN-synapse not only stores the synaptic weight but also stores the synapse's historical usage statistic on the device itself. We also show that the operation of the FN-synapse is near-optimal in terms of the synaptic lifetime and we demonstrate that a network comprising FN-synapses outperforms a comparable EWC network for a small benchmark continual learning task. With an energy footprint of femtojoules per synaptic update, we believe that the proposed FN-synapse provides an ultra-energy-efficient approach for implementing both synaptic memory consolidation and persistent learning.

Submitted 27 June, 2022; originally announced June 2022.

arXiv:2205.05664 [pdf, other]

doi 10.1109/TCSI.2022.3216287

Process, Bias and Temperature Scalable CMOS Analog Computing Circuits for Machine Learning

Authors: Pratik Kumar, Ankita Nandi, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: Analog computing is attractive compared to digital computing due to its potential for achieving higher computational density and higher energy efficiency. However, unlike digital circuits, conventional analog computing circuits cannot be easily mapped across different process nodes due to differences in transistor biasing regimes, temperature variations and limited dynamic range. In this work, we generalize the previously reported margin-propagation-based analog computing framework for designing novel shape-based analog computing (S-AC) circuits that can be easily cross-mapped across different process nodes. Similar to digital designs, S-AC designs can also be scaled for precision, speed, and power. As a proof-of-concept, we show several examples of S-AC circuits implementing mathematical functions that are commonly used in machine learning (ML) architectures. Using circuit simulations we demonstrate that the circuit input/output characteristics remain robust when mapped from a planar CMOS 180nm process to a FinFET 7nm process. Also, using benchmark datasets we demonstrate that the classification accuracy of an S-AC based neural network remains robust when mapped across the two processes and to changes in temperature.

Submitted 4 January, 2023; v1 submitted 11 May, 2022; originally announced May 2022.

Comments: 14 Pages, 15 Figures, 5 Tables. This work has been accepted in IEEE for publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2202.05022 [pdf, other]

doi 10.1109/JETCAS.2023.3234570

Bias-Scalable Near-Memory CMOS Analog Processor for Machine Learning

Authors: Pratik Kumar, Ankita Nandi, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: Bias-scalable analog computing is attractive for implementing machine learning (ML) processors with distinct power-performance specifications. For instance, ML implementations for server workloads are focused on higher computational throughput for faster training, whereas ML implementations for edge devices are focused on energy-efficient inference. In this paper, we demonstrate the implementation of bias-scalable approximate analog computing circuits using the generalization of the margin-propagation principle called shape-based analog computing (S-AC). The resulting S-AC core integrates several near-memory compute elements, which include: (a) non-linear activation functions; (b) inner-product compute circuits; and (c) a mixed-signal compressive memory, all of which can be scaled for performance or power while preserving its functionality. Using measured results from prototypes fabricated in a 180nm CMOS process, we demonstrate that the performance of computing modules remains robust to transistor biasing and variations in temperature. In this paper, we also demonstrate the effect of bias-scalability and computational accuracy on a simple ML regression task.

Submitted 4 January, 2023; v1 submitted 10 February, 2022; originally announced February 2022.

Comments: 11 pages, 11 figures, 2 Tables

arXiv:2202.01064 [pdf, ps, other]

A Dynamical Systems Framework for Generating the Riemann Zeta Function and Dirichlet L-functions

Authors: Shantanu Chakrabartty

Abstract: We first construct a dynamical systems model which in its steady-state serves as an analytic continuation of the completed Riemann zeta function over the entire critical strip. The resulting mathematical construct involves a linear interpolation of two symmetric generator functions which can be used to infer the global properties of the non-trivial zeros of the Riemann zeta function using concentration bounds. The proposed dynamical systems framework thus provides an alternative method for investigating the celebrated Riemann Hypothesis which is shown in this paper to be almost surely true. We also show that the framework is general enough to study the non-trivial zeros of the Dirichlet L-functions and in this paper we show that under specific conditions, the generalized Riemann Hypothesis is also almost surely true.

Submitted 22 April, 2022; v1 submitted 25 January, 2022; originally announced February 2022.

Comments: 17 pages. Comments and suggestions to improve the derivation are welcome

arXiv:2109.06171 [pdf, other]

doi 10.1109/JIOT.2021.3109739

In-filter Computing For Designing Ultra-light Acoustic Pattern Recognizers

Authors: Abhishek Ramdas Nair, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: We present a novel in-filter computing framework that can be used for designing ultra-light acoustic classifiers for use in smart internet-of-things (IoTs). Unlike a conventional acoustic pattern recognizer, where the feature extraction and classification are designed independently, the proposed architecture integrates the convolution and nonlinear filtering operations directly into the kernels of a Support Vector Machine (SVM). The result of this integration is a template-based SVM whose memory and computational footprint (training and inference) is light enough to be implemented on an FPGA-based IoT platform. While the proposed in-filter computing framework is general enough, in this paper, we demonstrate this concept using a Cascade of Asymmetric Resonator with Inner Hair Cells (CAR-IHC) based acoustic feature extraction algorithm. The complete system has been optimized using time-multiplexing and parallel-pipeline techniques for a Xilinx Spartan 7 series Field Programmable Gate Array (FPGA). We show that the system can achieve robust classification performance on benchmark sound recognition tasks using only ~ 1.5k Look-Up Tables (LUTs) and ~ 2.8k Flip-Flops (FFs), a significant improvement over other approaches.

Submitted 11 September, 2021; originally announced September 2021.

Comments: in IEEE Internet of Things Journal

arXiv:2108.09537 [pdf, other]

Using growth transform dynamical systems for spatio-temporal data sonification

Authors: Oindrila Chatterjee, Shantanu Chakrabartty

Abstract: Sonification, or encoding information in meaningful audio signatures, has several advantages in augmenting or replacing traditional visualization methods for human-in-the-loop decision-making. Standard sonification methods reported in the literature involve either (i) using only a subset of the variables, or (ii) first solving a learning task on the data and then mapping the output to an audio waveform, which is utilized by the end-user to make a decision. This paper presents a novel framework for sonifying high-dimensional data using a complex growth transform dynamical system model where both the learning (or, more generally, optimization) and the sonification processes are integrated together. Our algorithm takes as input the data and optimization parameters underlying the learning or prediction task and combines it with the psychoacoustic parameters defined by the user. As a result, the proposed framework outputs binaural audio signatures that not only encode some statistical properties of the high-dimensional data but also reveal the underlying complexity of the optimization/learning process. Along with extensive experiments using synthetic datasets, we demonstrate the framework on sonifying Electro-encephalogram (EEG) data with the potential for detecting epileptic seizures in pediatric patients.

Submitted 21 August, 2021; originally announced August 2021.

Comments: This article was submitted to PLoS One in March, 2021 and is currently under peer review

arXiv:2106.01958 [pdf, other]

Multiplierless MP-Kernel Machine For Energy-efficient Edge Devices

Authors: Abhishek Ramdas Nair, Pallab Kumar Nath, Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: We present a novel framework for designing multiplierless kernel machines that can be used on resource-constrained platforms like intelligent edge devices. The framework uses a piecewise linear (PWL) approximation based on a margin propagation (MP) technique and uses only addition/subtraction, shift, comparison, and register underflow/overflow operations. We propose a hardware-friendly MP-based inference and online training algorithm that has been optimized for a Field Programmable Gate Array (FPGA) platform. Our FPGA implementation eliminates the need for DSP units and reduces the number of LUTs. By reusing the same hardware for inference and training, we show that the platform can overcome classification errors and local minima artifacts that result from the MP approximation. The implementation of this proposed multiplierless MP-kernel machine on FPGA results in an estimated energy consumption of 13.4 pJ and power consumption of 107 mW with ~9k LUTs and FFs each for a 256 x 32 sized kernel making it superior in terms of power, performance, and area compared to other comparable implementations.

Submitted 9 September, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

arXiv:2104.05926 [pdf]

doi 10.1038/s41467-022-29320-6

An Adaptive Synaptic Array using Fowler-Nordheim Dynamic Analog Memory

Authors: Darshit Mehta, Kenji Aono, Shantanu Chakrabartty

Abstract: In this paper we present a synaptic array that uses dynamical states to implement an analog memory for energy-efficient training of machine learning (ML) systems. Each of the analog memory elements is a micro-dynamical system that is driven by the physics of Fowler-Nordheim (FN) quantum tunneling, whereas the system level learning modulates the state trajectory of the memory ensembles towards the optimal solution. We show that the extrinsic energy required for modulation can be matched to the dynamics of learning and weight decay leading to a significant reduction in the energy-dissipated during ML training. With the energy-dissipation as low as 5 fJ per memory update and a programming resolution up to 14 bits, the proposed synapse array could be used to address the energy-efficiency imbalance between the training and the inference phases observed in artificial intelligence (AI) systems.

Submitted 13 April, 2021; originally announced April 2021.

Comments: 22 pages (incl. 7 supplementary pages), 11 figures (incl. 6 supplementary figures)

arXiv:2104.04553 [pdf, other]

SPoTKD: A Protocol for Symmetric Key Distribution over Public Channels Using Self-Powered Timekeeping Devices

Authors: Mustafizur Rahman, Liang Zhou, Shantanu Chakrabartty

Abstract: In this paper, we propose a novel class of symmetric key distribution protocols that leverages basic security primitives offered by low-cost, hardware chipsets containing millions of synchronized self-powered timers. The keys are derived from the temporal dynamics of a physical, micro-scale time-keeping device which makes the keys immune to any potential side-channel attacks, malicious tampering, or snooping. Using the behavioral model of the self-powered timers, we first show that the derived key-strings can pass the randomness test as defined by the National Institute of Standards and Technology (NIST) suite. The key-strings are then used in two SPoTKD (Self-Powered Timer Key Distribution) protocols that exploit the timer's dynamics as one-way functions: (a) protocol 1 facilitates secure communications between a user and a remote Server, and (b) protocol 2 facilitates secure communications between two users. In this paper, we investigate the security of these protocols under the standard model and against different adversarial attacks. Using Monte-Carlo simulations, we also investigate the robustness of these protocols in the presence of real-world operating conditions and propose error-correcting SPoTKD protocols to mitigate these noise-related artifacts.

Submitted 17 January, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: 14 pages, 12 figures

arXiv:2004.12256 [pdf, other]

doi 10.1038/s41467-020-19292-w

A Self-powered Analog Sensor-data-logging Device based on Fowler-Nordheim Dynamical Systems

Authors: Darshit Mehta, Kenji Aono, Shantanu Chakrabartty

Abstract: Continuous, battery-free operation of sensor nodes requires ultra-low-power sensing and data-logging techniques. Here we report that by directly coupling a sensor/transducer signal into globally asymptotically stable monotonic dynamical systems based on Fowler-Nordheim quantum tunneling, one can achieve self-powered sensing at an energy budget that is currently unachievable using conventional energy harvesting methods. The proposed device uses a differential architecture to compensate for environmental variations, and it can retain sensed information for durations ranging from hours to days. With a theoretical operating energy budget of less than 10 attojoules, we demonstrate that, when integrated with a miniature piezoelectric transducer, the proposed sensor-data-logger can measure cumulative "action" due to ambient mechanical acceleration without any additional external power.

Submitted 3 October, 2020; v1 submitted 25 April, 2020; originally announced April 2020.

Comments: 24 pages (including 11 supplementary pages) and 16 figures (including 11 supplementary figures)

arXiv:1910.02304 [pdf, other]

Multiplierless and Sparse Machine Learning based on Margin Propagation Networks

Authors: Nazreen P. M., Shantanu Chakrabartty, Chetan Singh Thakur

Abstract: The new generation of machine learning processors has evolved from multi-core and parallel architectures that were designed to efficiently implement matrix-vector-multiplications (MVMs). This is because, at the fundamental level, neural network and machine learning operations extensively use MVM operations, and hardware compilers exploit the inherent parallelism in MVM operations to achieve hardware acceleration on GPUs and FPGAs. However, many IoT and edge computing platforms require embedded ML devices close to the network in order to compensate for communication cost and latency. Hence a natural question to ask is whether MVM operations are even necessary to implement ML algorithms and whether simpler hardware primitives can be used to implement an ultra-energy-efficient ML processor/architecture. In this paper we propose an alternate hardware-software codesign of ML and neural network architectures where, instead of using MVM operations and non-linear activation functions, the architecture only uses simple addition and thresholding operations to implement inference and learning. At the core of the proposed approach is margin-propagation (MP) based computation that maps multiplications into additions and additions into dynamic rectifying-linear-unit (ReLU) operations. This mapping results in significant improvement in computational and hence energy cost.
In this paper, we show how the MP network formulation can be applied for designing linear classifiers, shallow multi-layer perceptrons and support vector networks suitable for IoT platforms and tiny ML applications. We show that these MP based classifiers give results comparable to those of their traditional counterparts for benchmark UCI datasets, with the added advantage of a reduction in computational complexity enabling an improvement in energy efficiency.

Submitted 5 November, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

Comments: New results added

arXiv:1908.05377 [pdf, other]

Resonant Machine Learning Based on Complex Growth Transform Dynamical Systems

Authors: Oindrila Chatterjee, Shantanu Chakrabartty

Abstract: Traditional energy-based learning models associate a single energy metric with each configuration of variables involved in the underlying optimization process. Such models associate the lowest energy state with the optimal configuration of variables under consideration, and are thus inherently dissipative. In this paper we propose an energy-efficient learning framework that exploits structural and functional similarities between a machine learning network and a general electrical network satisfying Tellegen's theorem. In contrast to the standard energy-based models, the proposed formulation associates two energy components, namely active and reactive energy, with the network. This ensures that the network's active-power is dissipated only during the process of learning, whereas the reactive-power is maintained at zero at all times. As a result, in steady-state, the learned parameters are stored and self-sustained by electrical resonance determined by the network's nodal inductances and capacitances.
Based on this approach, this paper introduces three novel concepts: (a) a learning framework where the network's active-power dissipation is used as a regularization for a learning objective function that is subject to a zero total reactive-power constraint; (b) a dynamical system based on complex-domain, continuous-time growth transforms which optimizes the learning objective function and drives the network towards electrical resonance under steady-state operation; and (c) an annealing procedure that controls the trade-off between active-power dissipation and the speed of convergence. As a representative example, we show how the proposed framework can be used for designing resonant support vector machines (SVMs), where the support vectors correspond to an LC network with self-sustained oscillations.

Submitted 9 April, 2020; v1 submitted 14 August, 2019; originally announced August 2019.

Comments: Version3, accepted in IEEE TNNLS, March 2020

arXiv:1903.12330 [pdf, other]

doi 10.1109/MWSCAS.2019.8885180

Neuromorphic In-Memory Computing Framework using Memtransistor Cross-bar based Support Vector Machines

Authors: P. Kumar, A. R. Nair, O. Chatterjee, T. Paul, A. Ghosh, S. Chakrabartty, C. S. Thakur

Abstract: This paper presents a novel framework for designing support vector machines (SVMs), which does not restrict the SVM kernel to be positive-definite and allows the user to define a memory constraint in terms of fixed template vectors. This makes the framework scalable and enables its implementation for low-power, high-density and memory-constrained embedded applications. An efficient hardware implementation of the framework is also discussed, which utilizes a novel low-power memtransistor-based cross-bar architecture and is robust to device mismatch and randomness. We used memtransistor measurement data and showed that the designed SVMs can achieve classification accuracy comparable to traditional SVMs on both synthetic and real-world benchmark datasets. This framework would be beneficial for the design of SVM-based wake-up systems for Internet of Things (IoT) and edge devices, where memtransistors can be used to optimize the system's energy efficiency and perform in-memory matrix-vector multiplication (MVM).

Submitted 29 May, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

Comments: 4 pages, 5 figures, MWSCAS 2019

Journal ref: 2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)

arXiv:1811.02010 [pdf, other]

A Unified Perspective of Evolutionary Game Dynamics Using Generalized Growth Transforms

Authors: Oindrila Chatterjee, Shantanu Chakrabartty

Abstract: In this paper, we show that different types of evolutionary game dynamics are, in principle, special cases of a dynamical system model based on our previously reported framework of generalized growth transforms. The framework shows that different dynamics arise as a result of minimizing a population energy such that the population as a whole evolves to reach the most stable state. By introducing a population-dependent time-constant in the generalized growth transform model, the proposed framework can be used to explain a vast repertoire of evolutionary dynamics, including some novel forms of game dynamics with non-linear payoffs.

Submitted 5 November, 2018; originally announced November 2018.

arXiv:1711.11032 [pdf, other]

A Fowler-Nordheim Integrator can Track the Density of Prime Numbers

Authors: Liang Zhou, SriHarsha Kondapalli, Shantanu Chakrabartty

Abstract: "Does there exist a naturally occurring counting device that might elucidate the hidden structure of prime numbers?" is a question that has fascinated computer scientists and mathematical physicists for decades. While most recent research in this area has explored the role of the Riemann zeta-function in different formulations of statistical mechanics, condensed matter physics and quantum chaotic systems, the resulting devices (quantum or classical) have only existed in theory, or the fabrication of the device has been found not to be scalable to large prime numbers. Here we report for the first time that any hypothetical prime number generator, to our knowledge, has to be a special case of a dynamical system that is governed by the physics of Fowler-Nordheim (FN) quantum-tunneling. In this paper we report how such a dynamical system can be implemented using a counting process that naturally arises from sequential FN tunneling and integration of electrons on a floating-gate (FG) device. The self-compensating physics of the FG device makes the operation reliable and repeatable even when tunneling-currents approach levels below 1 attoampere. We report measured results from different variants of fabricated prototypes, each of which shows an excellent match with the asymptotic prime number statistics. We also report similarities between the spectral signatures produced by the FN device and the spectral statistics of a hypothetical prime number sequence generator.
We believe that the proposed floating-gate device could have future implications for understanding the process that generates prime numbers, with applications in security and authentication.

Submitted 24 November, 2017; originally announced November 2017.

Comments: 22 pages, 5 figures

arXiv:1707.06964 [pdf, other]

Global Optimization based on Growth Transform Dynamical Model

Authors: Oindrila Chatterjee, Shantanu Chakrabartty

Abstract: Conservation principles like conservation of charge or energy provide a natural way to couple and constrain different physical variables. In this letter, we propose a dynamical system model that exploits these constraints for solving non-convex global optimization problems. Unlike the traditional simulated annealing or quantum annealing based global optimization techniques, the proposed method optimizes a target objective function by continuously evolving a driver functional over a conservation manifold using a generalized variant of growth transformations. As a result, the driver functional converges to a Dirac-delta function that is centered at the global optimum of the target objective function. We provide an outline of the proof of convergence for the dynamical system model and we demonstrate the application of the model for implementing linear-time and constant-time decentralized sorting algorithms.

Submitted 21 July, 2017; originally announced July 2017.

Comments: 5 pages, 3 figures

arXiv:1707.06363 [pdf, other]

Energy-dissipation Limits in Variance-based Computing

Authors: Sri Harsha Kondapalli, Xuan Zhang, Shantanu Chakrabartty

Abstract: Variance-based logic (VBL) uses the fluctuations or the variance in the state of a particle or a physical quantity to represent different logic levels. In this letter we show that, compared to the traditional bi-stable logic representation, the variance-based representation can theoretically achieve a superior performance trade-off (in terms of energy dissipation and information capacity) when operating at fundamental limits imposed by thermal-noise. We show that for a bi-stable logic device the lower limit on energy dissipated per bit is 4.35KT/bit, whereas under similar operating conditions, a VBL device could achieve a lower limit of sub-KT/bit. These theoretical results are general enough to be applicable to different instantiations and variants of VBL, ranging from digital processors based on energy-scavenging to processors based on the emerging valleytronic devices.

Submitted 19 July, 2017; originally announced July 2017.

arXiv:1503.03297 [pdf, ps, other]

Uncertainty in Test Score Data and Classically Defined Reliability of Tests and Test Batteries, using a New Method for Test Dichotomisation

Authors: Satyendra Nath Chakrabartty, Kangrui Wang, Dalia Chakrabarty

Abstract: As with all measurements, the measurement of examinee ability, in terms of scores that the examinee obtains in a test, is also error-ridden. The quantification of such error or uncertainty in the test score data, or rather the complementary test reliability, is pursued within the paradigm of Classical Test Theory in a variety of ways, with no existing method of finding reliability that is isomorphic to the theoretical definition that parametrises reliability as the ratio of the true score variance and observed (i.e. error-ridden) score variance. Thus, multiple reliability coefficients for the same test have been advanced. This paper describes a much-needed method of obtaining the reliability of a test as per its theoretical definition, via a single administration of the test, by using a new fast method of splitting a given test into parallel halves, achieving near-coincident empirical distributions of the two halves. The method has the desirable property of achieving splitting on the basis of difficulty of the questions (or items) that constitute the test, thus allowing for fast computation of reliability even for very large test data sets, i.e. test data obtained by a very large examinee sample. An interval estimate for the true score is offered, given an examinee score, subsequent to the determination of the test reliability.
This method of finding test reliability as per the classical definition can be extended to find the reliability of a set or battery of tests; a method for determining the weights used in the computation of the weighted battery score is discussed. We perform an empirical illustration of our method on real and simulated tests, and on a real test battery comprising two constituent tests.

Submitted 12 March, 2015; v1 submitted 11 March, 2015; originally announced March 2015.

Comments: 30 pages
