subscribe to arXiv mailings

RoboKoop: Efficient Control Conditioned Representations from Visual Input in Robotics using Koopman Operator

Authors: Hemant Kumawat, Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Developing agents that can perform complex control tasks from high-dimensional observations is a core ability of autonomous agents that requires underlying robust task control policies and adapting the underlying visual representations to the task. Most existing policies need a lot of training samples and treat this problem from the lens of two-stage learning with a controller learned on top of pr… ▽ More Developing agents that can perform complex control tasks from high-dimensional observations is a core ability of autonomous agents that requires underlying robust task control policies and adapting the underlying visual representations to the task. Most existing policies need a lot of training samples and treat this problem from the lens of two-stage learning with a controller learned on top of pre-trained vision models. We approach this problem from the lens of Koopman theory and learn visual representations from robotic agents conditioned on specific downstream tasks in the context of learning stabilizing control for the agent. We introduce a Contrastive Spectral Koopman Embedding network that allows us to learn efficient linearized visual representations from the agent's visual data in a high dimensional latent space and utilizes reinforcement learning to perform off-policy control on top of the extracted representations with a linear controller. Our method enhances stability and control in gradient dynamics over time, significantly outperforming existing approaches by improving efficiency and accuracy in learning task policies over extended horizons. △ Less

Submitted 4 September, 2024; originally announced September 2024.

Comments: Accepted to the $8^{th}$ Conference on Robot Learning (CoRL 2024)

arXiv:2407.16062 [pdf, other]

Artificial Intelligence-based Decision Support Systems for Precision and Digital Health

Authors: Nina Deliu, Bibhas Chakraborty

Abstract: Precision health, increasingly supported by digital technologies, is a domain of research that broadens the paradigm of precision medicine, advancing everyday healthcare. This vision goes hand in hand with the groundbreaking advent of artificial intelligence (AI), which is reshaping the way we diagnose, treat, and monitor both clinical subjects and the general population. AI tools powered by machi… ▽ More Precision health, increasingly supported by digital technologies, is a domain of research that broadens the paradigm of precision medicine, advancing everyday healthcare. This vision goes hand in hand with the groundbreaking advent of artificial intelligence (AI), which is reshaping the way we diagnose, treat, and monitor both clinical subjects and the general population. AI tools powered by machine learning have shown considerable improvements in a variety of healthcare domains. In particular, reinforcement learning (RL) holds great promise for sequential and dynamic problems such as dynamic treatment regimes and just-in-time adaptive interventions in digital health. In this work, we discuss the opportunity offered by AI, more specifically RL, to current trends in healthcare, providing a methodological survey of RL methods in the context of precision and digital health. Focusing on the area of adaptive interventions, we expand the methodological survey with illustrative case studies that used RL in real practice. This invited article has undergone anonymous review and is intended as a book chapter for the volume "Frontiers of Statistics and Data Science" edited by Subhashis Ghoshal and Anindya Roy for the International Indian Statistical Association Series on Statistics and Data Science, published by Springer. It covers the material from a short course titled "Artificial Intelligence in Precision and Digital Health" taught by the author Bibhas Chakraborty at the IISA 2022 Conference, December 26-30 2022, at the Indian Institute of Science, Bengaluru. △ Less

Submitted 22 July, 2024; originally announced July 2024.

Comments: This contribution is a part of the volume "Frontiers of Statistics and Data Science" edited by Subhashis Ghoshal and Anindya Roy for the International Indian Statistical Association Series on Statistics and Data Science, published by Springer. It covers the material from a short course titled "Artificial Intelligence in Precision and Digital Health" (IISA 2022 Conference) and includes parts of arXiv:2203.02605

Journal ref: IISA Series on Statistics and Data Science, Frontiers of Statistics and Data Science, Ed. Subhashis Ghosal and Aninda Roy, 2024

arXiv:2407.06452 [pdf, other]

Exploiting Heterogeneity in Timescales for Sparse Recurrent Spiking Neural Networks for Energy-Efficient Edge Computing

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Spiking Neural Networks (SNNs) represent the forefront of neuromorphic computing, promising energy-efficient and biologically plausible models for complex tasks. This paper weaves together three groundbreaking studies that revolutionize SNN performance through the introduction of heterogeneity in neuron and synapse dynamics. We explore the transformative impact of Heterogeneous Recurrent Spiking N… ▽ More Spiking Neural Networks (SNNs) represent the forefront of neuromorphic computing, promising energy-efficient and biologically plausible models for complex tasks. This paper weaves together three groundbreaking studies that revolutionize SNN performance through the introduction of heterogeneity in neuron and synapse dynamics. We explore the transformative impact of Heterogeneous Recurrent Spiking Neural Networks (HRSNNs), supported by rigorous analytical frameworks and novel pruning methods like Lyapunov Noise Pruning (LNP). Our findings reveal how heterogeneity not only enhances classification performance but also reduces spiking activity, leading to more efficient and robust networks. By bridging theoretical insights with practical applications, this comprehensive summary highlights the potential of SNNs to outperform traditional neural networks while maintaining lower computational costs. Join us on a journey through the cutting-edge advancements that pave the way for the future of intelligent, energy-efficient neural computing. △ Less

Submitted 8 July, 2024; originally announced July 2024.

Comments: 20 pages, 12 figures, 5 tables. arXiv admin note: text overlap with arXiv:2211.04297, arXiv:2302.11618, arXiv:2403.03409

arXiv:2405.06621 [pdf, other]

On Streaming Codes for Simultaneously Correcting Burst and Random Erasures

Authors: Shobhit Bhatnagar, Biswadip Chakraborty, P. Vijay Kumar

Abstract: Streaming codes are packet-level codes that recover dropped packets within a strict decoding-delay constraint. We study streaming codes over a sliding-window (SW) channel model which admits only those erasure patterns which allow either a single burst erasure of $\le b$ packets along with $\le e$ random packet erasures, or else, $\le a$ random packet erasures, in any sliding-window of $w$ time slo… ▽ More Streaming codes are packet-level codes that recover dropped packets within a strict decoding-delay constraint. We study streaming codes over a sliding-window (SW) channel model which admits only those erasure patterns which allow either a single burst erasure of $\le b$ packets along with $\le e$ random packet erasures, or else, $\le a$ random packet erasures, in any sliding-window of $w$ time slots. We determine the optimal rate of a streaming code constructed via the popular diagonal embedding (DE) technique over such a SW channel under delay constraint $τ=(w-1)$ and provide an $O(w)$ field size code construction. For the case $e>1$, we show that it is not possible to significantly reduce this field size requirement, assuming the well-known MDS conjecture. We then provide a block code construction whose DE yields a streaming code achieving the rate derived above, over a field of size sub-linear in $w,$ for a family of parameters having $e=1.$ We show the field size optimality of this construction for some parameters, and near-optimality for others under a sparsity constraint. Additionally, we derive an upper-bound on the $d_{\text{min}}$ of a cyclic code and characterize cyclic codes which achieve this bound via their ability to simultaneously recover from burst and random erasures. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2404.13125 [pdf, other]

Towards Robust Real-Time Hardware-based Mobile Malware Detection using Multiple Instance Learning Formulation

Authors: Harshit Kumar, Sudarshan Sharma, Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: This study introduces RT-HMD, a Hardware-based Malware Detector (HMD) for mobile devices, that refines malware representation in segmented time-series through a Multiple Instance Learning (MIL) approach. We address the mislabeling issue in real-time HMDs, where benign segments in malware time-series incorrectly inherit malware labels, leading to increased false positives. Utilizing the proposed Ma… ▽ More This study introduces RT-HMD, a Hardware-based Malware Detector (HMD) for mobile devices, that refines malware representation in segmented time-series through a Multiple Instance Learning (MIL) approach. We address the mislabeling issue in real-time HMDs, where benign segments in malware time-series incorrectly inherit malware labels, leading to increased false positives. Utilizing the proposed Malicious Discriminative Score within the MIL framework, RT-HMD effectively identifies localized malware behaviors, thereby improving the predictive accuracy. Empirical analysis, using a hardware telemetry dataset collected from a mobile platform across 723 benign and 1033 malware samples, shows a 5% precision boost while maintaining recall, outperforming baselines affected by mislabeled benign segments. △ Less

Submitted 19 April, 2024; originally announced April 2024.

Comments: Under peer review

arXiv:2404.06460 [pdf, other]

Learning Locally Interacting Discrete Dynamical Systems: Towards Data-Efficient and Scalable Prediction

Authors: Beomseok Kang, Harshit Kumar, Minah Lee, Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Locally interacting dynamical systems, such as epidemic spread, rumor propagation through crowd, and forest fire, exhibit complex global dynamics originated from local, relatively simple, and often stochastic interactions between dynamic elements. Their temporal evolution is often driven by transitions between a finite number of discrete states. Despite significant advancements in predictive model… ▽ More Locally interacting dynamical systems, such as epidemic spread, rumor propagation through crowd, and forest fire, exhibit complex global dynamics originated from local, relatively simple, and often stochastic interactions between dynamic elements. Their temporal evolution is often driven by transitions between a finite number of discrete states. Despite significant advancements in predictive modeling through deep learning, such interactions among many elements have rarely explored as a specific domain for predictive modeling. We present Attentive Recurrent Neural Cellular Automata (AR-NCA), to effectively discover unknown local state transition rules by associating the temporal information between neighboring cells in a permutation-invariant manner. AR-NCA exhibits the superior generalizability across various system configurations (i.e., spatial distribution of states), data efficiency and robustness in extremely data-limited scenarios even in the presence of stochastic interactions, and scalability through spatial dimension-independent prediction. △ Less

Submitted 27 May, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: Accepted in Learning for Dynamics and Control Conference (L4DC) 2024

arXiv:2403.12462 [pdf, other]

Topological Representations of Heterogeneous Learning Dynamics of Recurrent Spiking Neural Networks

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Spiking Neural Networks (SNNs) have become an essential paradigm in neuroscience and artificial intelligence, providing brain-inspired computation. Recent advances in literature have studied the network representations of deep neural networks. However, there has been little work that studies representations learned by SNNs, especially using unsupervised local learning methods like spike-timing dep… ▽ More Spiking Neural Networks (SNNs) have become an essential paradigm in neuroscience and artificial intelligence, providing brain-inspired computation. Recent advances in literature have studied the network representations of deep neural networks. However, there has been little work that studies representations learned by SNNs, especially using unsupervised local learning methods like spike-timing dependent plasticity (STDP). Recent work by \cite{barannikov2021representation} has introduced a novel method to compare topological mappings of learned representations called Representation Topology Divergence (RTD). Though useful, this method is engineered particularly for feedforward deep neural networks and cannot be used for recurrent networks like Recurrent SNNs (RSNNs). This paper introduces a novel methodology to use RTD to measure the difference between distributed representations of RSNN models with different learning methods. We propose a novel reformulation of RSNNs using feedforward autoencoder networks with skip connections to help us compute the RTD for recurrent networks. Thus, we investigate the learning capabilities of RSNN trained using STDP and the role of heterogeneity in the synaptic dynamics in learning such representations. We demonstrate that heterogeneous STDP in RSNNs yield distinct representations than their homogeneous and surrogate gradient-based supervised learning counterparts. Our results provide insights into the potential of heterogeneous SNN models, aiding the development of more efficient and biologically plausible hybrid artificial intelligence systems. △ Less

Submitted 19 March, 2024; originally announced March 2024.

Comments: Accepted in IEEE World Congress on Computational Intelligence (IEEE WCCI) 2024

arXiv:2403.05235 [pdf]

Fairness-Aware Interpretable Modeling (FAIM) for Trustworthy Machine Learning in Healthcare

Authors: Mingxuan Liu, Yilin Ning, Yuhe Ke, Yuqing Shang, Bibhas Chakraborty, Marcus Eng Hock Ong, Roger Vaughan, Nan Liu

Abstract: The escalating integration of machine learning in high-stakes fields such as healthcare raises substantial concerns about model fairness. We propose an interpretable framework - Fairness-Aware Interpretable Modeling (FAIM), to improve model fairness without compromising performance, featuring an interactive interface to identify a "fairer" model from a set of high-performing models and promoting t… ▽ More The escalating integration of machine learning in high-stakes fields such as healthcare raises substantial concerns about model fairness. We propose an interpretable framework - Fairness-Aware Interpretable Modeling (FAIM), to improve model fairness without compromising performance, featuring an interactive interface to identify a "fairer" model from a set of high-performing models and promoting the integration of data-driven evidence and clinical expertise to enhance contextualized fairness. We demonstrated FAIM's value in reducing sex and race biases by predicting hospital admission with two real-world databases, MIMIC-IV-ED and SGH-ED. We show that for both datasets, FAIM models not only exhibited satisfactory discriminatory performance but also significantly mitigated biases as measured by well-established fairness metrics, outperforming commonly used bias-mitigation methods. Our approach demonstrates the feasibility of improving fairness without sacrificing performance and provides an a modeling mode that invites domain experts to engage, fostering a multidisciplinary effort toward tailored AI fairness. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05229 [pdf]

Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data

Authors: Siqi Li, Yuqing Shang, Ziwen Wang, Qiming Wu, Chuan Hong, Yilin Ning, Di Miao, Marcus Eng Hock Ong, Bibhas Chakraborty, Nan Liu

Abstract: Survival analysis serves as a fundamental component in numerous healthcare applications, where the determination of the time to specific events (such as the onset of a certain disease or death) for patients is crucial for clinical decision-making. Scoring systems are widely used for swift and efficient risk prediction. However, existing methods for constructing survival scores presume that data or… ▽ More Survival analysis serves as a fundamental component in numerous healthcare applications, where the determination of the time to specific events (such as the onset of a certain disease or death) for patients is crucial for clinical decision-making. Scoring systems are widely used for swift and efficient risk prediction. However, existing methods for constructing survival scores presume that data originates from a single source, posing privacy challenges in collaborations with multiple data owners. We propose a novel framework for building federated scoring systems for multi-site survival outcomes, ensuring both privacy and communication efficiency. We applied our approach to sites with heterogeneous survival data originating from emergency departments in Singapore and the United States. Additionally, we independently developed local scores at each site. In testing datasets from each participant site, our proposed federated scoring system consistently outperformed all local models, evidenced by higher integrated area under the receiver operating characteristic curve (iAUC) values, with a maximum improvement of 11.6%. Additionally, the federated score's time-dependent AUC(t) values showed advantages over local scores, exhibiting narrower confidence intervals (CIs) across most time points. The model developed through our proposed method exhibits effective performance on each local site, signifying noteworthy implications for healthcare research. Sites participating in our proposed federated scoring model training gained benefits by acquiring survival models with enhanced prediction accuracy and efficiency. This study demonstrates the effectiveness of our privacy-preserving federated survival score generation framework and its applicability to real-world heterogeneous survival data. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.03409 [pdf, other]

Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN

Authors: Biswadeep Chakraborty, Beomseok Kang, Harshit Kumar, Saibal Mukhopadhyay

Abstract: Recurrent Spiking Neural Networks (RSNNs) have emerged as a computationally efficient and brain-inspired learning model. The design of sparse RSNNs with fewer neurons and synapses helps reduce the computational complexity of RSNNs. Traditionally, sparse SNNs are obtained by first training a dense and complex SNN for a target task, and, then, pruning neurons with low activity (activity-based prunin… ▽ More Recurrent Spiking Neural Networks (RSNNs) have emerged as a computationally efficient and brain-inspired learning model. The design of sparse RSNNs with fewer neurons and synapses helps reduce the computational complexity of RSNNs. Traditionally, sparse SNNs are obtained by first training a dense and complex SNN for a target task, and, then, pruning neurons with low activity (activity-based pruning) while maintaining task performance. In contrast, this paper presents a task-agnostic methodology for designing sparse RSNNs by pruning a large randomly initialized model. We introduce a novel Lyapunov Noise Pruning (LNP) algorithm that uses graph sparsification methods and utilizes Lyapunov exponents to design a stable sparse RSNN from a randomly initialized RSNN. We show that the LNP can leverage diversity in neuronal timescales to design a sparse Heterogeneous RSNN (HRSNN). Further, we show that the same sparse HRSNN model can be trained for different tasks, such as image classification and temporal prediction. We experimentally show that, in spite of being task-agnostic, LNP increases computational efficiency (fewer neurons and synapses) and prediction performance of RSNNs compared to traditional activity-based pruning of trained dense models. △ Less

Submitted 5 March, 2024; originally announced March 2024.

Comments: Published as a conference paper at ICLR 2024

Journal ref: ICLR 2024

arXiv:2402.15163 [pdf, other]

Has the Deep Neural Network learned the Stochastic Process? A Wildfire Perspective

Authors: Harshit Kumar, Beomseok Kang, Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: This paper presents the first systematic study of evalution of Deep Neural Network (DNN) designed and trained to predict the evolution of a stochastic dynamical system, using wildfire prediction as a case study. We show that traditional evaluation methods based on threshold based classification metrics and error-based scoring rules assess a DNN's ability to replicate the observed ground truth (GT)… ▽ More This paper presents the first systematic study of evalution of Deep Neural Network (DNN) designed and trained to predict the evolution of a stochastic dynamical system, using wildfire prediction as a case study. We show that traditional evaluation methods based on threshold based classification metrics and error-based scoring rules assess a DNN's ability to replicate the observed ground truth (GT), but do not measure the fidelity of the DNN's learning of the underlying stochastic process. To address this gap, we propose a new system property: Statistic-GT, representing the GT of the stochastic process, and an evaluation metric that exclusively assesses fidelity to Statistic-GT. Utilizing a synthetic dataset, we introduce a stochastic framework to characterize this property and establish criteria for a metric to be a valid measure of the proposed property. We formally show that Expected Calibration Error (ECE) tests the necessary condition for fidelity to Statistic-GT. We perform empirical experiments, differentiating ECE's behavior from conventional metrics and demonstrate that ECE exclusively measures fidelity to the stochastic process. Extending our analysis to real-world wildfire data, we highlight the limitations of traditional evaluation methods and discuss the utility of evaluating fidelity to the stochastic process alongside existing metrics. △ Less

Submitted 22 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: Under peer review

arXiv:2401.14522 [pdf, other]

STEMFold: Stochastic Temporal Manifold for Multi-Agent Interactions in the Presence of Hidden Agents

Authors: Hemant Kumawat, Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Learning accurate, data-driven predictive models for multiple interacting agents following unknown dynamics is crucial in many real-world physical and social systems. In many scenarios, dynamics prediction must be performed under incomplete observations, i.e., only a subset of agents are known and observable from a larger topological system while the behaviors of the unobserved agents and their in… ▽ More Learning accurate, data-driven predictive models for multiple interacting agents following unknown dynamics is crucial in many real-world physical and social systems. In many scenarios, dynamics prediction must be performed under incomplete observations, i.e., only a subset of agents are known and observable from a larger topological system while the behaviors of the unobserved agents and their interactions with the observed agents are not known. When only incomplete observations of a dynamical system are available, so that some states remain hidden, it is generally not possible to learn a closed-form model in these variables using either analytic or data-driven techniques. In this work, we propose STEMFold, a spatiotemporal attention-based generative model, to learn a stochastic manifold to predict the underlying unmeasured dynamics of the multi-agent system from observations of only visible agents. Our analytical results motivate STEMFold design using a spatiotemporal graph with time anchors to effectively map the observations of visible agents to a stochastic manifold with no prior information about interaction graph topology. We empirically evaluated our method on two simulations and two real-world datasets, where it outperformed existing networks in predicting complex multiagent interactions, even with many unobserved agents. △ Less

Submitted 2 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: Accepted as a conference paper at $6^{th}$ Annual Learning for Dynamics & Control Conference 2024

arXiv:2312.05878 [pdf, other]

Skew Probabilistic Neural Networks for Learning from Imbalanced Data

Authors: Shraddha M. Naik, Tanujit Chakraborty, Abdenour Hadid, Bibhas Chakraborty

Abstract: Real-world datasets often exhibit imbalanced data distribution, where certain class levels are severely underrepresented. In such cases, traditional pattern classifiers have shown a bias towards the majority class, impeding accurate predictions for the minority class. This paper introduces an imbalanced data-oriented approach using probabilistic neural networks (PNNs) with a skew normal probabilit… ▽ More Real-world datasets often exhibit imbalanced data distribution, where certain class levels are severely underrepresented. In such cases, traditional pattern classifiers have shown a bias towards the majority class, impeding accurate predictions for the minority class. This paper introduces an imbalanced data-oriented approach using probabilistic neural networks (PNNs) with a skew normal probability kernel to address this major challenge. PNNs are known for providing probabilistic outputs, enabling quantification of prediction confidence and uncertainty handling. By leveraging the skew normal distribution, which offers increased flexibility, particularly for imbalanced and non-symmetric data, our proposed Skew Probabilistic Neural Networks (SkewPNNs) can better represent underlying class densities. To optimize the performance of the proposed approach on imbalanced datasets, hyperparameter fine-tuning is imperative. To this end, we employ a population-based heuristic algorithm, Bat optimization algorithms, for effectively exploring the hyperparameter space. We also prove the statistical consistency of the density estimates which suggests that the true distribution will be approached smoothly as the sample size increases. Experimental simulations have been conducted on different synthetic datasets, comparing various benchmark-imbalanced learners. Our real-data analysis shows that SkewPNNs substantially outperform state-of-the-art machine learning methods for both balanced and imbalanced datasets in most experimental settings. △ Less

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2311.14359 [pdf, other]

Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study

Authors: Xueqing Liu, Nina Deliu, Tanujit Chakraborty, Lauren Bell, Bibhas Chakraborty

Abstract: Mobile health (mHealth) interventions often aim to improve distal outcomes, such as clinical conditions, by optimizing proximal outcomes through just-in-time adaptive interventions. Contextual bandits provide a suitable framework for customizing such interventions according to individual time-varying contexts. However, unique challenges, such as modeling count outcomes within bandit frameworks, ha… ▽ More Mobile health (mHealth) interventions often aim to improve distal outcomes, such as clinical conditions, by optimizing proximal outcomes through just-in-time adaptive interventions. Contextual bandits provide a suitable framework for customizing such interventions according to individual time-varying contexts. However, unique challenges, such as modeling count outcomes within bandit frameworks, have hindered the widespread application of contextual bandits to mHealth studies. The current work addresses this challenge by leveraging count data models into online decision-making approaches. Specifically, we combine four common offline count data models (Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regressions) with Thompson sampling, a popular contextual bandit algorithm. The proposed algorithms are motivated by and evaluated on a real dataset from the Drink Less trial, where they are shown to improve user engagement with the mHealth platform. The proposed methods are further evaluated on simulated data, achieving improvement in maximizing cumulative proximal outcomes over existing algorithms. Theoretical results on regret bounds are also derived. The countts R package provides an implementation of our approach. △ Less

Submitted 29 July, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

arXiv:2304.07310 [pdf]

doi 10.1093/jamia/ocad170

Federated and distributed learning applications for electronic health records and structured medical data: A scoping review

Authors: Siqi Li, Pinyan Liu, Gustavo G. Nascimento, Xinru Wang, Fabio Renato Manzolli Leite, Bibhas Chakraborty, Chuan Hong, Yilin Ning, Feng Xie, Zhen Ling Teo, Daniel Shu Wei Ting, Hamed Haddadi, Marcus Eng Hock Ong, Marco Aurélio Peres, Nan Liu

Abstract: Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medi… ▽ More Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medical data, identifies contemporary limitations and discusses potential innovations. We searched five databases, SCOPUS, MEDLINE, Web of Science, Embase, and CINAHL, to identify articles that applied FL to structured medical data and reported results following the PRISMA guidelines. Each selected publication was evaluated from three primary perspectives, including data quality, modeling strategies, and FL frameworks. Out of the 1160 papers screened, 34 met the inclusion criteria, with each article consisting of one or more studies that used FL to handle structured clinical/medical data. Of these, 24 utilized data acquired from electronic health records, with clinical predictions and association studies being the most common clinical research tasks that FL was applied to. Only one article exclusively explored the vertical FL setting, while the remaining 33 explored the horizontal FL setting, with only 14 discussing comparisons between single-site (local) and FL (global) analysis. The existing FL applications on structured medical data lack sufficient evaluations of clinically meaningful benefits, particularly when compared to single-site analyses. Therefore, it is crucial for future FL applications to prioritize clinical motivations and develop designs and methodologies that can effectively support and aid clinical practice and research. △ Less

Submitted 14 April, 2023; originally announced April 2023.

arXiv:2304.04697 [pdf, other]

Brain-Inspired Spiking Neural Network for Online Unsupervised Time Series Prediction

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Energy and data-efficient online time series prediction for predicting evolving dynamical systems are critical in several fields, especially edge AI applications that need to update continuously based on streaming data. However, current DNN-based supervised online learning models require a large amount of training data and cannot quickly adapt when the underlying system changes. Moreover, these mo… ▽ More Energy and data-efficient online time series prediction for predicting evolving dynamical systems are critical in several fields, especially edge AI applications that need to update continuously based on streaming data. However, current DNN-based supervised online learning models require a large amount of training data and cannot quickly adapt when the underlying system changes. Moreover, these models require continuous retraining with incoming data making them highly inefficient. To solve these issues, we present a novel Continuous Learning-based Unsupervised Recurrent Spiking Neural Network Model (CLURSNN), trained with spike timing dependent plasticity (STDP). CLURSNN makes online predictions by reconstructing the underlying dynamical system using Random Delay Embedding by measuring the membrane potential of neurons in the recurrent layer of the RSNN with the highest betweenness centrality. We also use topological data analysis to propose a novel methodology using the Wasserstein Distance between the persistence homologies of the predicted and observed time series as a loss function. We show that the proposed online time series prediction methodology outperforms state-of-the-art DNN models when predicting an evolving Lorenz63 dynamical system. △ Less

Submitted 31 May, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

Comments: Manuscript accepted to be published in IJCNN 2023

arXiv:2303.00282 [pdf]

doi 10.1016/j.jbi.2023.104485

FedScore: A privacy-preserving framework for federated scoring system development

Authors: Siqi Li, Yilin Ning, Marcus Eng Hock Ong, Bibhas Chakraborty, Chuan Hong, Feng Xie, Han Yuan, Mingxuan Liu, Daniel M. Buckland, Yong Chen, Nan Liu

Abstract: We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess F… ▽ More We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess FedScore's performance, we built a hypothetical global scoring system for mortality prediction within 30 days after a visit to an emergency department using 10 simulated sites divided from a tertiary hospital in Singapore. We employed a pre-existing score generator to construct 10 local scoring systems independently at each site and we also developed a scoring system using centralized data for comparison. We compared the acquired FedScore model's performance with that of other scoring models using the receiver operating characteristic (ROC) analysis. The FedScore model achieved an average area under the curve (AUC) value of 0.763 across all sites, with a standard deviation (SD) of 0.020. We also calculated the average AUC values and SDs for each local model, and the FedScore model showed promising accuracy and stability with a high average AUC value which was closest to the one of the pooled model and SD which was lower than that of most local models. This study demonstrates that FedScore is a privacy-preserving scoring system generator with potentially good generalizability. △ Less

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2302.11622 [pdf, other]

Unsupervised 3D Object Learning through Neuron Activity aware Plasticity

Authors: Beomseok Kang, Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: We present an unsupervised deep learning model for 3D object classification. Conventional Hebbian learning, a well-known unsupervised model, suffers from loss of local features leading to reduced performance for tasks with complex geometric objects. We present a deep network with a novel Neuron Activity Aware (NeAW) Hebbian learning rule that dynamically switches the neurons to be governed by Hebb… ▽ More We present an unsupervised deep learning model for 3D object classification. Conventional Hebbian learning, a well-known unsupervised model, suffers from loss of local features leading to reduced performance for tasks with complex geometric objects. We present a deep network with a novel Neuron Activity Aware (NeAW) Hebbian learning rule that dynamically switches the neurons to be governed by Hebbian learning or anti-Hebbian learning, depending on its activity. We analytically show that NeAW Hebbian learning relieves the bias in neuron activity, allowing more neurons to attend to the representation of the 3D objects. Empirical results show that the NeAW Hebbian learning outperforms other variants of Hebbian learning and shows higher accuracy over fully supervised models when training data is limited. △ Less

Submitted 22 February, 2023; originally announced February 2023.

Comments: Published as a conference paper at ICLR 2023

arXiv:2302.11618 [pdf, other]

Heterogeneous Neuronal and Synaptic Dynamics for Spike-Efficient Unsupervised Learning: Theory and Design Principles

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: This paper shows that the heterogeneity in neuronal and synaptic dynamics reduces the spiking activity of a Recurrent Spiking Neural Network (RSNN) while improving prediction performance, enabling spike-efficient (unsupervised) learning. We analytically show that the diversity in neurons' integration/relaxation dynamics improves an RSNN's ability to learn more distinct input patterns (higher memor… ▽ More This paper shows that the heterogeneity in neuronal and synaptic dynamics reduces the spiking activity of a Recurrent Spiking Neural Network (RSNN) while improving prediction performance, enabling spike-efficient (unsupervised) learning. We analytically show that the diversity in neurons' integration/relaxation dynamics improves an RSNN's ability to learn more distinct input patterns (higher memory capacity), leading to improved classification and prediction performance. We further prove that heterogeneous Spike-Timing-Dependent-Plasticity (STDP) dynamics of synapses reduce spiking activity but preserve memory capacity. The analytical results motivate Heterogeneous RSNN design using Bayesian optimization to determine heterogeneity in neurons and synapses to improve $\mathcal{E}$, defined as the ratio of spiking activity and memory capacity. The empirical results on time series classification and prediction tasks show that optimized HRSNN increases performance and reduces spiking activity compared to a homogeneous RSNN. △ Less

Submitted 5 August, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: Paper Published in ICLR 2023 (https://openreview.net/forum?id=QIRtAqoXwj)

Journal ref: The Eleventh International Conference on Learning Representations 2023

arXiv:2211.04297 [pdf, other]

Heterogeneous Recurrent Spiking Neural Network for Spatio-Temporal Classification

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: Spiking Neural Networks are often touted as brain-inspired learning models for the third wave of Artificial Intelligence. Although recent SNNs trained with supervised backpropagation show classification accuracy comparable to deep networks, the performance of unsupervised learning-based SNNs remains much lower. This paper presents a heterogeneous recurrent spiking neural network (HRSNN) with unsup… ▽ More Spiking Neural Networks are often touted as brain-inspired learning models for the third wave of Artificial Intelligence. Although recent SNNs trained with supervised backpropagation show classification accuracy comparable to deep networks, the performance of unsupervised learning-based SNNs remains much lower. This paper presents a heterogeneous recurrent spiking neural network (HRSNN) with unsupervised learning for spatio-temporal classification of video activity recognition tasks on RGB (KTH, UCF11, UCF101) and event-based datasets (DVS128 Gesture). The key novelty of the HRSNN is that the recurrent layer in HRSNN consists of heterogeneous neurons with varying firing/relaxation dynamics, and they are trained via heterogeneous spike-time-dependent-plasticity (STDP) with varying learning dynamics for each synapse. We show that this novel combination of heterogeneity in architecture and learning method outperforms current homogeneous spiking neural networks. We further show that HRSNN can achieve similar performance to state-of-the-art backpropagation trained supervised SNN, but with less computation (fewer neurons and sparse connection) and less training data. △ Less

Submitted 22 September, 2022; originally announced November 2022.

Comments: 32 pages, 11 Figures, 4 Tables. arXiv admin note: text overlap with arXiv:1511.03198 by other authors

arXiv:2210.08258 [pdf]

doi 10.1016/j.artmed.2023.102587

Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques

Authors: Mingxuan Liu, Siqi Li, Han Yuan, Marcus Eng Hock Ong, Yilin Ning, Feng Xie, Seyed Ehsan Saffari, Victor Volovici, Bibhas Chakraborty, Nan Liu

Abstract: Objective: The proper handling of missing values is critical to delivering reliable estimates and decisions, especially in high-stakes fields such as clinical research. The increasing diversity and complexity of data have led many researchers to develop deep learning (DL)-based imputation techniques. We conducted a systematic review to evaluate the use of these techniques, with a particular focus… ▽ More Objective: The proper handling of missing values is critical to delivering reliable estimates and decisions, especially in high-stakes fields such as clinical research. The increasing diversity and complexity of data have led many researchers to develop deep learning (DL)-based imputation techniques. We conducted a systematic review to evaluate the use of these techniques, with a particular focus on data types, aiming to assist healthcare researchers from various disciplines in dealing with missing values. Methods: We searched five databases (MEDLINE, Web of Science, Embase, CINAHL, and Scopus) for articles published prior to August 2021 that applied DL-based models to imputation. We assessed selected publications from four perspectives: health data types, model backbone (i.e., main architecture), imputation strategies, and comparison with non-DL-based methods. Based on data types, we created an evidence map to illustrate the adoption of DL models. Results: We included 64 articles, of which tabular static (26.6%, 17/64) and temporal data (37.5%, 24/64) were the most frequently investigated. We found that model backbone(s) differed among data types as well as the imputation strategy. The "integrated" strategy, that is, the imputation task being solved concurrently with downstream tasks, was popular for tabular temporal (50%, 12/24) and multi-modal data (71.4%, 5/7), but limited for other data types. Moreover, DL-based imputation methods yielded better imputation accuracy in most studies, compared with non-DL-based methods. Conclusion: DL-based imputation models can be customized based on data type, addressing the corresponding missing patterns, and its associated "integrated" strategy can enhance the efficacy of imputation, especially in scenarios where data is complex. Future research may focus on the portability and fairness of DL-based models for healthcare data imputation. △ Less

Submitted 15 October, 2022; originally announced October 2022.

arXiv:2206.12447 [pdf, other]

doi 10.1109/TIFS.2023.3318969

XMD: An Expansive Hardware-telemetry based Mobile Malware Detector to enhance Endpoint Detection

Authors: Harshit Kumar, Biswadeep Chakraborty, Sudarshan Sharma, Saibal Mukhopadhyay

Abstract: Hardware-based Malware Detectors (HMDs) have shown promise in detecting malicious workloads. However, the current HMDs focus solely on the CPU core of a System-on-Chip (SoC) and, therefore, do not exploit the full potential of the hardware telemetry. In this paper, we propose XMD, an HMD that uses an expansive set of telemetry channels extracted from the different subsystems of SoC. XMD exploits t… ▽ More Hardware-based Malware Detectors (HMDs) have shown promise in detecting malicious workloads. However, the current HMDs focus solely on the CPU core of a System-on-Chip (SoC) and, therefore, do not exploit the full potential of the hardware telemetry. In this paper, we propose XMD, an HMD that uses an expansive set of telemetry channels extracted from the different subsystems of SoC. XMD exploits the thread-level profiling power of the CPU-core telemetry, and the global profiling power of non-core telemetry channels, to achieve significantly better detection performance than currently used Hardware Performance Counter (HPC) based detectors. We leverage the concept of manifold hypothesis to analytically prove that adding non-core telemetry channels improves the separability of the benign and malware classes, resulting in performance gains. We train and evaluate XMD using hardware telemetries collected from 723 benign applications and 1033 malware samples on a commodity Android Operating System (OS)-based mobile device. XMD improves over currently used HPC-based detectors by 32.91% for the in-distribution test data. XMD achieves the best detection performance of 86.54% with a false positive rate of 2.9%, compared to the detection rate of 80%, offered by the best performing signature-based Anti-Virus(AV) on VirusTotal, on the same set of malware samples. △ Less

Submitted 14 September, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: Revised version based on peer review feedback. Manuscript to appear in IEEE Transactions on Information Forensics and Security

arXiv:2203.02605 [pdf, other]

doi 10.1111/insr.12583

Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions

Authors: Nina Deliu, Joseph Jay Williams, Bibhas Chakraborty

Abstract: In recent years, reinforcement learning (RL) has acquired a prominent position in health-related sequential decision-making problems, gaining traction as a valuable tool for delivering adaptive interventions (AIs). However, in part due to a poor synergy between the methodological and the applied communities, its real-life application is still limited and its potential is still to be realized. To a… ▽ More In recent years, reinforcement learning (RL) has acquired a prominent position in health-related sequential decision-making problems, gaining traction as a valuable tool for delivering adaptive interventions (AIs). However, in part due to a poor synergy between the methodological and the applied communities, its real-life application is still limited and its potential is still to be realized. To address this gap, our work provides the first unified technical survey on RL methods, complemented with case studies, for constructing various types of AIs in healthcare. In particular, using the common methodological umbrella of RL, we bridge two seemingly different AI domains, dynamic treatment regimes and just-in-time adaptive interventions in mobile health, highlighting similarities and differences between them and discussing the implications of using RL. Open problems and considerations for future research directions are outlined. Finally, we leverage our experience in designing case studies in both areas to showcase the significant collaborative opportunities between statistical, RL, and healthcare researchers in advancing AIs. △ Less

Submitted 11 May, 2024; v1 submitted 4 March, 2022; originally announced March 2022.

Comments: 57 pages

Journal ref: International Statistical Review (2024)

arXiv:2202.08407 [pdf]

AutoScore-Ordinal: An interpretable machine learning framework for generating scoring models for ordinal outcomes

Authors: Seyed Ehsan Saffari, Yilin Ning, Xie Feng, Bibhas Chakraborty, Victor Volovici, Roger Vaughan, Marcus Eng Hock Ong, Nan Liu

Abstract: Background: Risk prediction models are useful tools in clinical decision-making which help with risk stratification and resource allocations and may lead to a better health care for patients. AutoScore is a machine learning-based automatic clinical score generator for binary outcomes. This study aims to expand the AutoScore framework to provide a tool for interpretable risk prediction for ordinal… ▽ More Background: Risk prediction models are useful tools in clinical decision-making which help with risk stratification and resource allocations and may lead to a better health care for patients. AutoScore is a machine learning-based automatic clinical score generator for binary outcomes. This study aims to expand the AutoScore framework to provide a tool for interpretable risk prediction for ordinal outcomes. Methods: The AutoScore-Ordinal framework is generated using the same 6 modules of the original AutoScore algorithm including variable ranking, variable transformation, score derivation (from proportional odds models), model selection, score fine-tuning, and model evaluation. To illustrate the AutoScore-Ordinal performance, the method was conducted on electronic health records data from the emergency department at Singapore General Hospital over 2008 to 2017. The model was trained on 70% of the data, validated on 10% and tested on the remaining 20%. Results: This study included 445,989 inpatient cases, where the distribution of the ordinal outcome was 80.7% alive without 30-day readmission, 12.5% alive with 30-day readmission, and 6.8% died inpatient or by day 30 post discharge. Two point-based risk prediction models were developed using two sets of 8 predictor variables identified by the flexible variable selection procedure. The two models indicated reasonably good performance measured by mean area under the receiver operating characteristic curve (0.785 and 0.793) and generalized c-index (0.737 and 0.760), which were comparable to alternative models. Conclusion: AutoScore-Ordinal provides an automated and easy-to-use framework for development and validation of risk prediction models for ordinal outcomes, which can systematically identify potential predictors from high-dimensional data. △ Less

Submitted 16 February, 2022; originally announced February 2022.

arXiv:2201.03291 [pdf]

A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study

Authors: Yilin Ning, Siqi Li, Marcus Eng Hock Ong, Feng Xie, Bibhas Chakraborty, Daniel Shu Wei Ting, Nan Liu

Abstract: Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach usin… ▽ More Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach using the recently developed Shapley variable importance cloud (ShapleyVIC) that accounts for variability across models. Our approach evaluates and visualizes overall variable contributions for in-depth inference and transparent variable selection, and filters out non-significant contributors to simplify model building steps. We derive an ensemble variable ranking from variable contributions, which is easily integrated with an automated and modularized risk score generator, AutoScore, for convenient implementation. In a study of early death or unplanned readmission, ShapleyVIC selected 6 of 41 candidate variables to create a well-performing model, which had similar performance to a 16-variable model from machine-learning-based ranking. △ Less

Submitted 10 January, 2022; originally announced January 2022.

arXiv:2111.11017 [pdf]

Benchmarking emergency department triage prediction models with machine learning and large public electronic health records

Authors: Feng Xie, Jun Zhou, Jin Wee Lee, Mingrui Tan, Siqi Li, Logasan S/O Rajnthern, Marcel Lucas Chee, Bibhas Chakraborty, An-Kwok Ian Wong, Alon Dagan, Marcus Eng Hock Ong, Fei Gao, Nan Liu

Abstract: The demand for emergency department (ED) services is increasing across the globe, particularly during the current COVID-19 pandemic. Clinical triage and risk assessment have become increasingly challenging due to the shortage of medical resources and the strain on hospital infrastructure caused by the pandemic. As a result of the widespread use of electronic health records (EHRs), we now have acce… ▽ More The demand for emergency department (ED) services is increasing across the globe, particularly during the current COVID-19 pandemic. Clinical triage and risk assessment have become increasingly challenging due to the shortage of medical resources and the strain on hospital infrastructure caused by the pandemic. As a result of the widespread use of electronic health records (EHRs), we now have access to a vast amount of clinical data, which allows us to develop predictive models and decision support systems to address these challenges. To date, however, there are no widely accepted benchmark ED triage prediction models based on large-scale public EHR data. An open-source benchmarking platform would streamline research workflows by eliminating cumbersome data preprocessing, and facilitate comparisons among different studies and methodologies. In this paper, based on the Medical Information Mart for Intensive Care IV Emergency Department (MIMIC-IV-ED) database, we developed a publicly available benchmark suite for ED triage predictive models and created a benchmark dataset that contains over 400,000 ED visits from 2011 to 2019. We introduced three ED-based outcomes (hospitalization, critical outcomes, and 72-hour ED reattendance) and implemented a variety of popular methodologies, ranging from machine learning methods to clinical scoring systems. We evaluated and compared the performance of these methods against benchmark tasks. Our codes are open-source, allowing anyone with MIMIC-IV-ED data access to perform the same steps in data processing, benchmark model building, and experiments. This study provides future researchers with insights, suggestions, and protocols for managing raw data and developing risk triaging tools for emergency care. △ Less

Submitted 20 March, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

arXiv:2110.02484 [pdf]

Shapley variable importance clouds for interpretable machine learning

Authors: Yilin Ning, Marcus Eng Hock Ong, Bibhas Chakraborty, Benjamin Alan Goldstein, Daniel Shu Wei Ting, Roger Vaughan, Nan Liu

Abstract: Interpretable machine learning has been focusing on explaining final models that optimize performance. The current state-of-the-art is the Shapley additive explanations (SHAP) that locally explains variable impact on individual predictions, and it is recently extended for a global assessment across the dataset. Recently, Dong and Rudin proposed to extend the investigation to models from the same c… ▽ More Interpretable machine learning has been focusing on explaining final models that optimize performance. The current state-of-the-art is the Shapley additive explanations (SHAP) that locally explains variable impact on individual predictions, and it is recently extended for a global assessment across the dataset. Recently, Dong and Rudin proposed to extend the investigation to models from the same class as the final model that are "good enough", and identified a previous overclaim of variable importance based on a single model. However, this method does not directly integrate with existing Shapley-based interpretations. We close this gap by proposing a Shapley variable importance cloud that pools information across good models to avoid biased assessments in SHAP analyses of final models, and communicate the findings via novel visualizations. We demonstrate the additional insights gain compared to conventional explanations and Dong and Rudin's method using criminal justice and electronic medical records data. △ Less

Submitted 5 October, 2021; originally announced October 2021.

arXiv:2107.11500 [pdf, other]

$μ$DARTS: Model Uncertainty-Aware Differentiable Architecture Search

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: We present a Model Uncertainty-aware Differentiable ARchiTecture Search ($μ$DARTS) that optimizes neural networks to simultaneously achieve high accuracy and low uncertainty. We introduce concrete dropout within DARTS cells and include a Monte-Carlo regularizer within the training loss to optimize the concrete dropout probabilities. A predictive variance term is introduced in the validation loss t… ▽ More We present a Model Uncertainty-aware Differentiable ARchiTecture Search ($μ$DARTS) that optimizes neural networks to simultaneously achieve high accuracy and low uncertainty. We introduce concrete dropout within DARTS cells and include a Monte-Carlo regularizer within the training loss to optimize the concrete dropout probabilities. A predictive variance term is introduced in the validation loss to enable searching for architecture with minimal model uncertainty. The experiments on CIFAR10, CIFAR100, SVHN, and ImageNet verify the effectiveness of $μ$DARTS in improving accuracy and reducing uncertainty compared to existing DARTS methods. Moreover, the final architecture obtained from $μ$DARTS shows higher robustness to noise at the input image and model parameters compared to the architecture obtained from existing DARTS methods. △ Less

Submitted 11 September, 2022; v1 submitted 23 July, 2021; originally announced July 2021.

Comments: 10 pages, 7 Tables, 6 Figures, Accepted in IEEE ACCESS

Journal ref: IEEE ACCESS 2022

arXiv:2107.09951 [pdf]

doi 10.1016/j.jbi.2021.103980

Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies

Authors: Feng Xie, Han Yuan, Yilin Ning, Marcus Eng Hock Ong, Mengling Feng, Wynne Hsu, Bibhas Chakraborty, Nan Liu

Abstract: Objective: Temporal electronic health records (EHRs) can be a wealth of information for secondary uses, such as clinical events prediction or chronic disease management. However, challenges exist for temporal data representation. We therefore sought to identify these challenges and evaluate novel methodologies for addressing them through a systematic examination of deep learning solutions. Metho… ▽ More Objective: Temporal electronic health records (EHRs) can be a wealth of information for secondary uses, such as clinical events prediction or chronic disease management. However, challenges exist for temporal data representation. We therefore sought to identify these challenges and evaluate novel methodologies for addressing them through a systematic examination of deep learning solutions. Methods: We searched five databases (PubMed, EMBASE, the Institute of Electrical and Electronics Engineers [IEEE] Xplore Digital Library, the Association for Computing Machinery [ACM] digital library, and Web of Science) complemented with hand-searching in several prestigious computer science conference proceedings. We sought articles that reported deep learning methodologies on temporal data representation in structured EHR data from January 1, 2010, to August 30, 2020. We summarized and analyzed the selected articles from three perspectives: nature of time series, methodology, and model implementation. Results: We included 98 articles related to temporal data representation using deep learning. Four major challenges were identified, including data irregularity, data heterogeneity, data sparsity, and model opacity. We then studied how deep learning techniques were applied to address these challenges. Finally, we discuss some open challenges arising from deep learning. Conclusion: Temporal EHR data present several major challenges for clinical prediction modeling and data utilization. To some extent, current deep learning solutions can address these challenges. Future studies can consider designing comprehensive and integrated solutions. Moreover, researchers should incorporate additional clinical domain knowledge into study designs and enhance the interpretability of the model to facilitate its implementation in clinical practice. △ Less

Submitted 21 July, 2021; originally announced July 2021.

arXiv:2107.07875 [pdf, other]

A Penalized Shared-parameter Algorithm for Estimating Optimal Dynamic Treatment Regimens

Authors: Palash Ghosh, Trikay Nalamada, Shruti Agarwal, Maria Jahja, Bibhas Chakraborty

Abstract: A dynamic treatment regimen (DTR) is a set of decision rules to personalize treatments for an individual using their medical history. The Q-learning based Q-shared algorithm has been used to develop DTRs that involve decision rules shared across multiple stages of intervention. We show that the existing Q-shared algorithm can suffer from non-convergence due to the use of linear models in the Q-lea… ▽ More A dynamic treatment regimen (DTR) is a set of decision rules to personalize treatments for an individual using their medical history. The Q-learning based Q-shared algorithm has been used to develop DTRs that involve decision rules shared across multiple stages of intervention. We show that the existing Q-shared algorithm can suffer from non-convergence due to the use of linear models in the Q-learning setup, and identify the condition in which Q-shared fails. Leveraging properties from expansion-constrained ordinary least-squares, we give a penalized Q-shared algorithm that not only converges in settings that violate the condition, but can outperform the original Q-shared algorithm even when the condition is satisfied. We give evidence for the proposed method in a real-world application and several synthetic simulations. △ Less

Submitted 26 May, 2022; v1 submitted 13 July, 2021; originally announced July 2021.

arXiv:2107.06039 [pdf]

doi 10.1016/j.jbi.2022.104072

AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data

Authors: Han Yuan, Feng Xie, Marcus Eng Hock Ong, Yilin Ning, Marcel Lucas Chee, Seyed Ehsan Saffari, Hairil Rizal Abdullah, Benjamin Alan Goldstein, Bibhas Chakraborty, Nan Liu

Abstract: Background: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among a wide variety of decision-making models for determining the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. Its current framework, however, still leaves room for… ▽ More Background: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among a wide variety of decision-making models for determining the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. Its current framework, however, still leaves room for improvement when addressing unbalanced data of rare events. Methods: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. All scoring models were evaluated on the basis of their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches in the prediction of inpatient mortality. Results: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839) while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.800). The AutoScore-Imbalance sub-model (using down-sampling algorithm) yielded an AUC of 0. 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. Conclusions: The AutoScore-Imbalance tool has the potential to be applied to highly unbalanced datasets to gain further insight into rare medical events and to facilitate real-world clinical decision-making. △ Less

Submitted 13 July, 2021; originally announced July 2021.

arXiv:2106.06957 [pdf]

doi 10.1016/j.jbi.2021.103959

AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data

Authors: Feng Xie, Yilin Ning, Han Yuan, Benjamin Alan Goldstein, Marcus Eng Hock Ong, Nan Liu, Bibhas Chakraborty

Abstract: Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad-hoc using a few manually selected variables based on clinician's knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. AutoScore was previously developed as an interpretabl… ▽ More Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad-hoc using a few manually selected variables based on clinician's knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. AutoScore was previously developed as an interpretable machine learning score generator, integrated both machine learning and point-based scores in the strong discriminability and accessibility. We have further extended it to time-to-event data and developed AutoScore-Survival, for automatically generating time-to-event scores with right-censored survival data. Random survival forest provides an efficient solution for selecting variables, and Cox regression was used for score weighting. We illustrated our method in a real-life study of 90-day mortality of patients in intensive care units and compared its performance with survival models (i.e., Cox) and the random survival forest. The AutoScore-Survival-derived scoring model was more parsimonious than survival models built using traditional variable selection methods (e.g., penalized likelihood approach and stepwise variable selection), and its performance was comparable to survival models using the same set of variables. Although AutoScore-Survival achieved a comparable integrated area under the curve of 0.782 (95% CI: 0.767-0.794), the integer-valued time-to-event scores generated are favorable in clinical applications because they are easier to compute and interpret. Our proposed AutoScore-Survival provides an automated, robust and easy-to-use machine learning-based clinical score generator to studies of time-to-event outcomes. It provides a systematic guideline to facilitate the future development of time-to-event scores for clinical applications. △ Less

Submitted 13 June, 2021; originally announced June 2021.

arXiv:2105.14677 [pdf, other]

Characterization of Generalizability of Spike Timing Dependent Plasticity trained Spiking Neural Networks

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay

Abstract: A Spiking Neural Network (SNN) is trained with Spike Timing Dependent Plasticity (STDP), which is a neuro-inspired unsupervised learning method for various machine learning applications. This paper studies the generalizability properties of the STDP learning processes using the Hausdorff dimension of the trajectories of the learning algorithm. The paper analyzes the effects of STDP learning models… ▽ More A Spiking Neural Network (SNN) is trained with Spike Timing Dependent Plasticity (STDP), which is a neuro-inspired unsupervised learning method for various machine learning applications. This paper studies the generalizability properties of the STDP learning processes using the Hausdorff dimension of the trajectories of the learning algorithm. The paper analyzes the effects of STDP learning models and associated hyper-parameters on the generalizability properties of an SNN. The analysis is used to develop a Bayesian optimization approach to optimize the hyper-parameters for an STDP model for improving the generalizability properties of an SNN. △ Less

Submitted 9 August, 2021; v1 submitted 30 May, 2021; originally announced May 2021.

Comments: 23 pages, submitted to Frontiers in Neuroscience. arXiv admin note: text overlap with arXiv:2010.08195, arXiv:2006.09313 by other authors

arXiv:2104.10719 [pdf, other]

doi 10.1109/TIP.2021.3122092

A Fully Spiking Hybrid Neural Network for Energy-Efficient Object Detection

Authors: Biswadeep Chakraborty, Xueyuan She, Saibal Mukhopadhyay

Abstract: This paper proposes a Fully Spiking Hybrid Neural Network (FSHNN) for energy-efficient and robust object detection in resource-constrained platforms. The network architecture is based on Convolutional SNN using leaky-integrate-fire neuron models. The model combines unsupervised Spike Time-Dependent Plasticity (STDP) learning with back-propagation (STBP) learning methods and also uses Monte Carlo D… ▽ More This paper proposes a Fully Spiking Hybrid Neural Network (FSHNN) for energy-efficient and robust object detection in resource-constrained platforms. The network architecture is based on Convolutional SNN using leaky-integrate-fire neuron models. The model combines unsupervised Spike Time-Dependent Plasticity (STDP) learning with back-propagation (STBP) learning methods and also uses Monte Carlo Dropout to get an estimate of the uncertainty error. FSHNN provides better accuracy compared to DNN based object detectors while being 150X energy-efficient. It also outperforms these object detectors, when subjected to noisy input data and less labeled training data with a lower uncertainty error. △ Less

Submitted 23 July, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

Comments: 10 pages, Submitted Manuscript

arXiv:2101.11451 [pdf, other]

Controlling by Showing: i-Mimic: A Video-based Method to Control Robotic Arms

Authors: Debarati B. Chakraborty, Mukesh Sharma, Bhaskar Vijay

Abstract: A novel concept of vision-based intelligent control of robotic arms is developed here in this work. This work enables the controlling of robotic arms motion only with visual inputs, that is, controlling by showing the videos of correct movements. This work can broadly be sub-divided into two segments. The first part of this work is to develop an unsupervised vision-based method to control robotic… ▽ More A novel concept of vision-based intelligent control of robotic arms is developed here in this work. This work enables the controlling of robotic arms motion only with visual inputs, that is, controlling by showing the videos of correct movements. This work can broadly be sub-divided into two segments. The first part of this work is to develop an unsupervised vision-based method to control robotic arm in 2-D plane, and the second one is with deep CNN in the same task in 3-D plane. The first method is unsupervised, where our aim is to perform mimicking of human arm motion in real-time by a manipulator. We developed a network, namely the vision-to-motion optical network (DON), where the input should be a video stream containing hand movements of human, the the output would be out the velocity and torque information of the hand movements shown in the videos. The output information of the DON is then fed to the robotic arm by enabling it to generate motion according to the real hand videos. The method has been tested with both live-stream video feed as well as on recorded video obtained from a monocular camera even by intelligently predicting the trajectory of human hand hand when it gets occluded. This is why the mimicry of the arm incorporates some intelligence to it and becomes intelligent mimic (i-mimic). Alongside the unsupervised method another method has also been developed deploying the deep neural network technique with CNN (Convolutional Neural Network) to perform the mimicking, where labelled datasets are used for training. The same dataset, as used in the unsupervised DON-based method, is used in the deep CNN method, after manual annotations. Both the proposed methods are validated with off-line as well as with on-line video datasets in real-time. The entire methodology is validated with real-time 1-link and simulated n-link manipulators alongwith suitable comparisons. △ Less

Submitted 27 January, 2021; originally announced January 2021.

arXiv:2101.08463 [pdf]

COLLIDE-PRED: Prediction of On-Road Collision From Surveillance Videos

Authors: Deesha Chavan, Dev Saad, Debarati B. Chakraborty

Abstract: Predicting on-road abnormalities such as road accidents or traffic violations is a challenging task in traffic surveillance. If such predictions can be done in advance, many damages can be controlled. Here in our wok, we tried to formulate a solution for automated collision prediction in traffic surveillance videos with computer vision and deep networks. It involves object detection, tracking, tra… ▽ More Predicting on-road abnormalities such as road accidents or traffic violations is a challenging task in traffic surveillance. If such predictions can be done in advance, many damages can be controlled. Here in our wok, we tried to formulate a solution for automated collision prediction in traffic surveillance videos with computer vision and deep networks. It involves object detection, tracking, trajectory estimation, and collision prediction. We propose an end-to-end collision prediction system, named as COLLIDE-PRED, that intelligently integrates the information of past and future trajectories of moving objects to predict collisions in videos. It is a pipeline that starts with object detection, which is used for object tracking, and then trajectory prediction is performed which concludes by collision detection. The probable place of collision, and the objects those may cause the collision, both can be identified correctly with COLLIDE-PRED. The proposed method is experimentally validated with a number of different videos and proves to be effective in identifying accident in advance. △ Less

Submitted 21 January, 2021; originally announced January 2021.

arXiv:2009.01368 [pdf, other]

doi 10.1109/JIOT.2021.3051480

Cost-aware Feature Selection for IoT Device Classification

Authors: Biswadeep Chakraborty, Dinil Mon Divakaran, Ido Nevat, Gareth W. Peters, Mohan Gurusamy

Abstract: Classification of IoT devices into different types is of paramount importance, from multiple perspectives, including security and privacy aspects. Recent works have explored machine learning techniques for fingerprinting (or classifying) IoT devices, with promising results. However, existing works have assumed that the features used for building the machine learning models are readily available or… ▽ More Classification of IoT devices into different types is of paramount importance, from multiple perspectives, including security and privacy aspects. Recent works have explored machine learning techniques for fingerprinting (or classifying) IoT devices, with promising results. However, existing works have assumed that the features used for building the machine learning models are readily available or can be easily extracted from the network traffic; in other words, they do not consider the costs associated with feature extraction. In this work, we take a more realistic approach, and argue that feature extraction has a cost, and the costs are different for different features. We also take a step forward from the current practice of considering the misclassification loss as a binary value, and make a case for different losses based on the misclassification performance. Thereby, and more importantly, we introduce the notion of risk for IoT device classification. We define and formulate the problem of cost-aware IoT device classification. This being a combinatorial optimization problem, we develop a novel algorithm to solve it in a fast and effective way using the Cross-Entropy (CE) based stochastic optimization technique. Using traffic of real devices, we demonstrate the capability of the CE based algorithm in selecting features with minimal risk of misclassification while keeping the cost for feature extraction within a specified limit. △ Less

Submitted 21 April, 2021; v1 submitted 2 September, 2020; originally announced September 2020.

Comments: 33 pages, 9 figures

Journal ref: Internet of Things Journal 2021

arXiv:1903.08272 [pdf]

Design and Evaluation of Self-Assembled Actin-Based Nano-Communication

Authors: Oussama Abderrahmane Dambri, Soumaya Cherkaoui, Biswadeep Chakraborty

Abstract: The tremendous progress in nanotechnology over the last century, makes it possible to engineer tiny nanodevices, which they need a nano-communication network to interact. Two solutions are proposed in literature to create a nano-communication system, either by using the classical electromagnetic paradigm with Terahertz band, or using the bio-inspired molecular communication. However, Terahertz is… ▽ More The tremendous progress in nanotechnology over the last century, makes it possible to engineer tiny nanodevices, which they need a nano-communication network to interact. Two solutions are proposed in literature to create a nano-communication system, either by using the classical electromagnetic paradigm with Terahertz band, or using the bio-inspired molecular communication. However, Terahertz is suffering from molecular absorption and scattering losses at nano level, and the achievable throughput of molecular communication is very low. In this paper, we propose a new solution to establish a wired nano-communication. Self-assembled actin-based is a new method that takes advantage of actin filaments self-assembly to create a nano wire between a transmitter and a receiver, and use electrons as information carriers. VPython framework is used in this paper to perform stochastic simulations of the nano wire formation. The algorithms used for the simulations are presented. The stability of the constructed nano wire is analyzed, and the error probability is calculated. Self-assembled actin-based method promises a fast and stable nano-communication system with a very high achievable throughput. △ Less

Submitted 19 March, 2019; originally announced March 2019.

arXiv:1610.09974 [pdf]

doi 10.1016/j.asej.2016.01.012

A review on application of data mining techniques to combat natural disasters

Authors: Saptarsi Goswami, Sanjay Chakraborty, Sanhita Ghosh, Amlan Chakrabarti, Basabi Chakraborty

Abstract: Thousands of human lives are lost every year around the globe, apart from significant damage on property, animal life, etc., due to natural disasters (e.g., earthquake, flood, tsunami, hurricane and other storms, landslides, cloudburst, heat wave, forest fire). In this paper, we focus on reviewing the application of data mining and analytical techniques designed so far for (i) prediction, (ii) det… ▽ More Thousands of human lives are lost every year around the globe, apart from significant damage on property, animal life, etc., due to natural disasters (e.g., earthquake, flood, tsunami, hurricane and other storms, landslides, cloudburst, heat wave, forest fire). In this paper, we focus on reviewing the application of data mining and analytical techniques designed so far for (i) prediction, (ii) detection, and (iii) development of appropriate disaster management strategy based on the collected data from disasters. A detailed description of availability of data from geological observatories (seismological, hydrological), satellites, remote sensing and newer sources like social networking sites as twitter is presented. An extensive and in-depth literature study on current techniques for disaster prediction, detection and management has been done and the results are summarized according to various types of disasters. Finally a framework for building a disaster management database for India hosted on open source Big Data platform like Hadoop in a phased manner has been proposed. The study has special focus on India which ranks among top five counties in terms of absolute number of the loss of human life. △ Less

Submitted 31 October, 2016; originally announced October 2016.

Showing 1–39 of 39 results for author: Chakraborty, B